Convert separate 'year' and 'day' columns into one 'date' column with lubridate? - r

I have one column for 'year' and one column for 'day' (julian day) and I want to use these columns to make a 'date' (2002-12-03 or any format is okay). I've found a lot of lubridate literature on breaking up dates, but am looking for how to put them together.
As simple as it sounds:
year day
2008 1
2008 2
2008 3
2008 4
to
year day date
2008 1 2008-1-1
etc.

You could use base R's Date class rather than lubridate if you like.
If this is your sample data
dd<-data.frame(
year = rep(2008,4),
day = 1:4
)
Then you can run
years <- unique(dd$year)
yearstarts <- setNames(as.Date(paste0(years, "-01-01")), years)
newdates <- yearstarts[as.character(dd$year)] + dd$day - 1
Which produces
cbind(dd, newdates)
# year day newdates
# 1 2008 1 2008-01-01
# 2 2008 2 2008-01-02
# 3 2008 3 2008-01-03
# 4 2008 4 2008-01-04
This works because the base Date class stores a date as the number of days since an origin (1970-01-01). So when you add 1, you are adding one day to the current date. Here I assumed you may have multiple years, so I make sure to correctly calculate the Date value for the first day of each year.
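As a minimal sketch of the day-count arithmetic described above, the snippet below first checks how a Date is stored and then shows one possible alternative that parses the year and day-of-year in a single step with the %j format code. The zero-padding with sprintf and the new column name date are assumptions made for illustration, not part of the original answer.

```r
# A Date is a count of days since 1970-01-01, so the origin itself is day 0
origin <- as.Date("1970-01-01")
origin_count <- as.numeric(origin)

# Adding a number to a Date moves it forward by that many days
jan1 <- as.Date("2008-01-01")
jan4 <- jan1 + 3

# Alternative (an assumption, not the answer's method): build "YYYY-DDD"
# strings and parse them with %j, the day-of-year code from ?strptime
dd <- data.frame(year = rep(2008, 4), day = 1:4)
dd$date <- as.Date(sprintf("%d-%03d", dd$year, dd$day), format = "%Y-%j")
print(dd)
```

Because %j counts days within the given year, this form handles leap years and multiple years without a lookup table of year starts.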

Related

Adding two columns of a data frame where col1 contains dates and col2 contains days

I have a data frame with two columns, date and days. I want to add days to date and show the result in a new column.
Data frame (the date column is in mm/dd/yyyy format):
date days
3/2/2019 8
3/5/2019 4
3/6/2019 4
3/21/2019 3
3/25/2019 7
I want my output to look like this:
date days new-date
3/2/2019 8 3/10/2019
3/5/2019 4 3/9/2019
3/6/2019 4 3/10/2019
3/21/2019 3 3/24/2019
3/25/2019 7 4/1/2019
I was trying this:
as.Date("3/10/2019") +8
but I think it will only work for a single value.
Convert date to actual Date values and then add days. You need to specify the format of the date strings (see ?strptime) when converting them to Date.
as.Date(df$date, "%m/%d/%Y") + df$days
#[1] "2019-03-10" "2019-03-09" "2019-03-10" "2019-03-24" "2019-04-01"
If you want the output back in the original style, use format (note that %m and %d produce zero-padded values):
df$new_date <- format(as.Date(df$date, "%m/%d/%Y") + df$days, "%m/%d/%Y")
df
# date days new_date
#1 3/2/2019 8 03/10/2019
#2 3/5/2019 4 03/09/2019
#3 3/6/2019 4 03/10/2019
#4 3/21/2019 3 03/24/2019
#5 3/25/2019 7 04/01/2019
If the format strings get confusing, lubridate can parse the dates for you:
library(lubridate)
with(df, mdy(date) + days)
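For reference, here is a self-contained base R sketch of the same idea, with the data frame from the question rebuilt by hand. The intermediate name parsed and the NA check are additions for illustration; a wrong format string makes as.Date return NA rather than fail, so the check is a cheap safeguard.

```r
# Rebuild the example data from the question
df <- data.frame(
  date = c("3/2/2019", "3/5/2019", "3/6/2019", "3/21/2019", "3/25/2019"),
  days = c(8, 4, 4, 3, 7),
  stringsAsFactors = FALSE
)

# Parse the month/day/year strings into Date values
parsed <- as.Date(df$date, format = "%m/%d/%Y")

# A mismatched format yields NA, so check before doing arithmetic
stopifnot(!anyNA(parsed))

# Date + numeric is vectorized: each row gets its own number of days added
df$new_date <- parsed + df$days
print(df)
```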

using if else statements to manipulate dates [duplicate]

I am trying to do an if else statement to say if the value is less than 10 add a zero in front, if not leave it as is. I am trying to get all of my dates to be 2 digits. Please assist.
if(df$col < 10){
paste '0'
else df$col
}
I was trying to break it down into different columns
EventID SampleDate SampleTime
130466 3/19/2008 12:30:00
131392 4/30/2008 08:45:00
131658 5/14/2008 10:00:00
117770 6/11/2008 08:45:00
118680 7/23/2008 09:15:00
118903 8/6/2008 09:00:00
SampleDatech year month day2
3/19/2008 2008 3 19
4/30/2008 2008 4 30
5/14/2008 2008 5 14
6/11/2008 2008 6 11
7/23/2008 2008 7 23
8/6/2008 2008 8 6
If you are trying to output just the day with a leading zero to a new column, you can use a combination of strftime and as.Date.
df$day = strftime(as.Date(df$SampleDate, "%m/%d/%Y"), "%d")
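Or, if you want to keep the whole date but zero-pad the month and day, you can do this. Note the capital %Y when parsing, since the years have four digits (a lowercase %y would read only the first two digits of the year).
df$NewDate = strftime(as.Date(df$SampleDate, "%m/%d/%Y"), "%m/%d/%Y")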
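If the day numbers are already in their own column, the padding can also be done directly on the numbers. The sketch below uses a hypothetical vector day2 copied from the question's table and shows two options: sprintf with a zero-padded width, and a vectorized ifelse that mirrors the if/else logic the question was attempting (a plain if only looks at a single value, which is why the original attempt cannot work on a whole column).

```r
# Day numbers as in the question's day2 column
day2 <- c(19, 30, 14, 11, 23, 6)

# Option 1: format each number with a minimum width of 2, padded with zeros
padded_sprintf <- sprintf("%02d", day2)

# Option 2: vectorized version of "if less than 10, put a zero in front"
padded_ifelse <- ifelse(day2 < 10, paste0("0", day2), as.character(day2))

print(padded_sprintf)
print(padded_ifelse)
```

Both results are character vectors, so they are suitable for display or pasting into a date string but not for arithmetic.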

month.abb[] is resulting in incorrect results

I have the following data set. I am trying to split the date_1 field into month and day, and then convert the month number to a month name.
date_1,no_of_births_1
1/1,1482
2/2,1213
3/23,1220
4/4,1319
5/11,1262
6/18,1271
I am using month.abb[] to convert the month number to a name, but instead of returning the right month name for each month number, it returns the wrong values.
For example, month.abb[] returns Apr instead of Feb for month 2.
date_1 no_of_births_1 V1 V2 month
1 1/1 1482 1 1 Jan
2 2/2 1213 2 2 Apr
3 3/23 1220 3 23 May
4 4/4 1319 4 4 Jun
5 5/11 1262 5 11 Jul
6 6/18 1271 6 18 Aug
Below is the code I am using:
birthday<-read.csv("Birthday_s.csv",header = TRUE)
birthday$date_1<-as.character(birthday$date_1)
#split the data
listx<-sapply(birthday$date_1,function(x) strsplit(x,"/"))
library(base)
#convert to data frame
mat<-as.data.frame(matrix(unlist(listx),ncol = 2, byrow = TRUE))
#combine birthday and mat
birthday2<-cbind(birthday,mat)
#convert month number to month name
birthday2$month<-sapply(birthday2$V1, function(x) month.abb[as.numeric(x)])
When I run your code, I get the correct months. However, your code is more complicated than necessary. Here are two ways to extract month and day from date_1:
First, when you read the data, use stringsAsFactors=FALSE, which prevents strings from getting converted to factors.
birthday <- read.csv("Birthday_s.csv",header = TRUE, stringsAsFactors=FALSE)
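One plausible explanation for the shifted months, assuming the month column ended up as a factor (this depends on the R version and on how the data frame was built): as.numeric() on a factor returns the internal level codes, not the numbers the labels spell out. The toy vector v1 below is hypothetical and only illustrates the mechanism.

```r
# A factor whose labels look like month numbers
v1 <- factor(c("1", "2", "10"))

# Levels are sorted as strings, so their order need not match numeric order
print(levels(v1))

# Level codes: positions in levels(v1), not the numbers in the labels
codes <- as.numeric(v1)

# Going through character first recovers the intended numbers
values <- as.numeric(as.character(v1))

# Indexing month.abb with the recovered numbers gives the expected names
months_ok <- month.abb[values]
print(months_ok)
```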
Extract month and days using date functions:
library(lubridate)
birthday$month = month(as.POSIXct(birthday$date_1, format="%m/%d"), abbr=TRUE, label=TRUE)
birthday$day = day(as.POSIXct(birthday$date_1, format="%m/%d"))
Extract month and days using Regular Expressions:
birthday$month = month.abb[as.numeric(gsub("([0-9]{1,2}).*", "\\1", birthday$date_1))]
birthday$day = as.numeric(gsub(".*/([0-9]{1,2}$)", "\\1", birthday$date_1))
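A third option, sketched here with base R only, is to split each string once and convert both pieces to integers. The vector date_1 is copied from the question's sample rows, and the names parts, month_num, month, and day are chosen for illustration.

```r
# Sample values from the question
date_1 <- c("1/1", "2/2", "3/23", "4/4", "5/11", "6/18")

# Split on the literal "/" and stack the pieces into a two-column matrix
parts <- do.call(rbind, strsplit(date_1, "/", fixed = TRUE))

# Convert the character pieces to integers explicitly
month_num <- as.integer(parts[, 1])
day <- as.integer(parts[, 2])

# month.abb is indexed by month number (1 = "Jan", ..., 12 = "Dec")
month <- month.abb[month_num]

print(data.frame(date_1, month, day))
```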

Pandas custom week

I'm trying to define a custom week for a dataframe.
I have a dataframe with timestamps.
I've read the questions on here regarding isocalendar. While it does the job, it's not what I want.
I'm trying to define the weeks from Friday to Thursday.
For example:
Friday 2nd Jan 2015 would be the first day of the week.
Thursday 8th Jan 2015 would be the last day of the week.
And this would be week 1.
Is there a way to set a custom weekday, so that when I access the datetime library I get the result that I expect?
df['Week_Number'] = df['Date'].dt.week
Here's one solution - convert your dates to a Period representing weeks that end on Thursday.
In [39]: df = pd.DataFrame({'Date':pd.date_range('2015-1-1', '2015-12-31')})
In [40]: df['Period'] = df['Date'].dt.to_period('W-THU')
In [41]: df['Week_Number'] = df['Period'].dt.week
In [44]: df.head()
Out[44]:
Date Period Week_Number
0 2015-01-01 2014-12-26/2015-01-01 1
1 2015-01-02 2015-01-02/2015-01-08 2
2 2015-01-03 2015-01-02/2015-01-08 2
3 2015-01-04 2015-01-02/2015-01-08 2
4 2015-01-05 2015-01-02/2015-01-08 2
Note that it follows the same convention as datetimes, where week 1 can be incomplete, so you may have to do a little extra munging if you want 1 to be the first complete week.

Extracting last date of the year from a date object

I have following data set:
>d
x date
1 1 1-3-2013
2 2 2-4-2010
3 3 2-5-2011
4 4 1-6-2012
I want:
> d
x date
1 1 31-12-2013
2 2 31-12-2010
3 3 31-12-2011
4 4 31-12-2012
i.e. Last day, last month and the year of the date object.
Please Help!
You can also just use the ceiling_date function in the lubridate package.
You can do something like this (assuming date is already a Date or date-time object, for example dmy(d$date)):
library(lubridate)
last_date <- ceiling_date(date,"year") - days(1)
ceiling_date(date,"year") gives you the first day of the next year; to get the last day of the current year, subtract 1 or days(1) from it.
Hope this helps.
Another option using lubridate package:
## using d from Roland answer
transform(d,last =dmy(paste0('3112',year(dmy(date)))))
x date last
1 1 1-3-2013 2013-12-31
2 2 2-4-2010 2010-12-31
3 3 2-5-2011 2011-12-31
4 4 1-6-2012 2012-12-31
d <- read.table(text="x date
1 1 1-3-2013
2 2 2-4-2010
3 3 2-5-2011
4 4 1-6-2012", header=TRUE)
d$date <- as.Date(d$date, "%d-%m-%Y")
d$date <- as.POSIXlt(d$date)
d$date$mon <- 11
d$date$mday <- 31
d$date <- as.Date(d$date)
# x date
#1 1 2013-12-31
#2 2 2010-12-31
#3 3 2011-12-31
#4 4 2012-12-31
1) cut.Date Define cut_year to give the first day of the year. Adding 366 gets us to the next year and then applying cut_year again gets us to the first day of the next year. Finally subtract 1 to get the last day of the year. The code uses base functionality only.
cut_year <- function(x) as.Date(cut(as.Date(x), "year"))
transform(d, date = cut_year(cut_year(date) + 366) - 1)
2) format
transform(d, date = as.Date(format(as.Date(date), "%Y-12-31")))
3) zoo A "yearmon" class variable stores the date as a year plus 0 for Jan, 1/12 for Feb, ..., 11/12 for Dec. Thus taking its floor and adding 11/12 gets one to Dec and as.Date.yearmon(..., frac = 1) uses the last of the month instead of the first.
library(zoo)
transform(d, date = as.Date(floor(as.yearmon(as.Date(date))) + 11 / 12, frac = 1))
Note: The inner as.Date in cut_year and in the other two solutions can be omitted if it is known that date is already of "Date" class.
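To make the base R solutions above easier to try, here is a self-contained sketch that rebuilds d, parses the day-month-year strings explicitly, and runs the cut and format approaches side by side. The names dates, last_by_format, and last_by_cut are chosen for illustration; since date is converted up front, the inner as.Date is dropped from cut_year, as the note allows.

```r
# Rebuild the example data with dates as day-month-year strings
d <- data.frame(
  x = 1:4,
  date = c("1-3-2013", "2-4-2010", "2-5-2011", "1-6-2012"),
  stringsAsFactors = FALSE
)

# Parse explicitly; the default as.Date formats do not match d-m-Y input
dates <- as.Date(d$date, format = "%d-%m-%Y")

# format approach: keep the year, hard-code December 31
last_by_format <- as.Date(format(dates, "%Y-12-31"))

# cut approach: first day of the year, jump past year end, cut again, step back
cut_year <- function(x) as.Date(cut(x, "year"))
last_by_cut <- cut_year(cut_year(dates) + 366) - 1

print(data.frame(x = d$x, last_by_format, last_by_cut))
```

Adding 366 rather than 365 guarantees landing in the next year even when the current year is a leap year such as 2012.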

Resources