as.Date(ww-yyyy) in R

I have an Excel file with many dates in the format "ww-yyyy". When it is loaded into R, each date shows up as a number such as 42370.
I need to extract the week and year for each data point x.
I have tried as.POSIXct, as.POSIXlt, and as.Date(x, format = "%w-%y", origin = "1899-12-30").
as.Date(42370, format = "%w%y") gives the output "2023-11-14", but the expected result is week 53 of 2015.
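A value such as 42370 looks like an Excel date serial number rather than a "ww-yyyy" string, so one option is to convert the serial to a Date first and format the week and year afterwards. The following is a minimal sketch, assuming the workbook uses Excel's 1900 date system (hence the origin 1899-12-30) and that ISO 8601 week numbering is wanted; the variable names are only illustrative.

```r
# Excel serial number as read into R (value taken from the question)
x <- 42370

# Assumption: the workbook uses Excel's 1900 date system,
# for which the matching origin in R is 1899-12-30.
d <- as.Date(x, origin = "1899-12-30")
format(d)                               # "2016-01-01"

# ISO 8601 week number (%V) and the year that week belongs to (%G)
iso_week <- format(d, "%V")
iso_year <- format(d, "%G")
paste(iso_week, iso_year, sep = "-")    # "53-2015"
```

Using %G instead of %Y matters here, because 1 January 2016 falls in the last ISO week of 2015. Note that %V and %G are output-only formats in R, and their support may depend on the platform.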

Related

Converting monthly numerics to readable dates in R

How can I make R count months instead of days when converting integers to dates?
After reading several threads on converting dates in R, it seems nobody has asked how to convert numeric dates when the numbers index a monthly time series. For example, 552 represents January 2006.
I have tried several things, such as as.Date(dates, origin = "1899-12-01"), but I recognize that R counts days instead of months. Thus, month number 552 yields "1901-06-06" instead of the correct 2006-01-01.
Side note: I also want the format to be YEARmonth. Does R allow displaying dates without days?
I think your starting date should be '1960-01-01'.
In any case, you can solve this problem with the lubridate package, which lets you start from a date and add months:
library(lubridate)
as.Date('1960-01-01') %m+% months(552)
This gives
[1] "2006-01-01"
You can display only the year and month of a date, but in that case R coerces the date to a character string:
format(as.Date('2006-01-01'), "%Y-%m")
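The same conversion can also be sketched in base R with integer arithmetic, without lubridate. This assumes, as suggested above, that month 0 corresponds to January 1960; the function name month_index_to_date is hypothetical.

```r
# Convert a month index (0 = January of origin_year) to the first day of
# that month. Assumption: the series counts months from January 1960.
month_index_to_date <- function(n, origin_year = 1960L) {
  n     <- as.integer(n)
  year  <- as.integer(origin_year) + n %/% 12L   # whole years elapsed
  month <- n %% 12L + 1L                         # month within the year, 1..12
  as.Date(sprintf("%04d-%02d-01", year, month))
}

dates <- month_index_to_date(c(0, 552, 563))
format(dates)            # "1960-01-01" "2006-01-01" "2006-12-01"
format(dates, "%Y-%m")   # "1960-01" "2006-01" "2006-12" (character, not Date)
```

Keeping the values as Date objects and applying format() only for display avoids losing the date class too early.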

time series in R with sales prediction with only date values

I have data from 2015 with dates in mm/dd/yy format and sales figures. I need to predict sales for 2016 from this data. I know I need to use time series forecasting, but I have no idea how. Many examples use only yearly data (1960, 1970, ...), whereas my data covers a single year with several months. I also don't know how to plot it. Can you give me a clear structure for how to proceed?
Assuming the date is a string in the format mm/dd/yy,
convert the string into a date using this code:
a <- "07/23/15"
b <- as.Date(a, format = "%m/%d/%y")
fullYear <- format(b, '%Y')  # "2015", the four-digit year
halfYear <- format(b, '%y')  # "15", the two-digit year
After this you can work on
I have found the solution. I converted the sales figures into a time series,
plotted the data, and checked whether there was any trend or seasonality.
Since the data showed only a trend, I applied Holt's exponential smoothing from the forecast package. The 2016 sales were then forecast and plotted.
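The steps described above might look roughly like the following. This is only a sketch: the sales numbers are made up, base R's HoltWinters() with gamma = FALSE is used as a stand-in for the forecast package's function, and the alpha and beta values are arbitrary assumptions.

```r
# Hypothetical monthly sales totals for 2015 (made-up numbers, illustration only)
sales <- c(100, 104, 109, 115, 118, 124, 129, 133, 140, 144, 150, 155)

# Monthly time series starting in January 2015
sales_ts <- ts(sales, start = c(2015, 1), frequency = 12)
# plot(sales_ts)   # uncomment to inspect the series for trend or seasonality

# Holt's linear-trend smoothing: gamma = FALSE switches off the seasonal term.
# alpha and beta are fixed to arbitrary values here; leaving them out lets
# HoltWinters() estimate them from the data.
fit <- HoltWinters(sales_ts, alpha = 0.8, beta = 0.2, gamma = FALSE)

# Forecast the 12 months of 2016
fc <- predict(fit, n.ahead = 12)
fc
```

With only one year of data, a seasonal model cannot be estimated, which is one reason a trend-only method is a reasonable choice here.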

Post-Process a Stata %tw date in R

The %tw format in Stata has the form: 1960w1 which has no equivalent in R.
Therefore %tw dates must be post-processed.
Importing a .dta file into R, the date is an integer like 1304 (instead of 1985w5) or 1426 (instead of 1987w23). If it was a simple time series you could set a starting date as follows:
ts(df, start= c(1985,5), frequency=52)
Another possibility would be:
as.Date(Camp$date, format= "%Yw%W" , origin = "1985w5")
But if each row is not a single date, then you must convert it.
The package ISOweek is based on ISO-8601 with the form "1985-W05" and does not process the Stata %tw.
The lubridate package does not work with this format either. Its week() function returns the number of complete seven-day periods that have occurred between the date and January 1st, plus one (see the documentation of the week function).
In Stata, week 1 of any year starts on 1 January, whatever day of the week that is (see the Stata documentation on dates).
With the %W format in R, weeks start on Monday.
From ?strptime, %V is
the week of the year as decimal number (00-53) as defined in ISO 8601. If the week (starting on Monday) containing 1 January has four or more days in the new year, then it is considered week 1. Otherwise, it is the last week of the previous year, and the next week is week 1. (Accepted but ignored on input.)
Larmarange noted on GitHub that haven doesn't interpret these dates properly:
months, week, quarter and halfyear are specific format from Stata,
respectively %tm, %tw, %tq and %th. I'm not sure that there are
corresponding formats available in R. So far they are imported as
integers.
Is there a way to convert Stata %tw to a date format R understands?
Here is a Stata file with dates
This won't be an answer in terms of R code, but it is commentary on Stata weeks that can't be fitted into a comment.
Strictly, dates in Stata are not defined by the display formats that make them intelligible to people. A date in Stata is always a numeric variable or scalar or macro defined with origin the first instance in 1960. Thus it is at best a shorthand to talk about %tw dates, etc. We can use display to see the effects of different date display formats:
. di %td 0
01jan1960
. di %tw 0
1960w1
. di %tq 0
1960q1
. di %td 42
12feb1960
. di %tw 42
1960w43
. di %tq 42
1970q3
A subtle point made explicit above is that changing the display format will not change what is stored, i.e. the numeric value.
Otherwise put, dates in Stata are not distinct data types; they are just integers made intelligible as dates by a pertinent display format.
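To illustrate the same point from the R side, the integer 42 can be decoded under each of the three interpretations shown above. This is a sketch assuming the Stata rules described in this thread (day 0 is 01jan1960, exactly 52 weeks per year, 4 quarters per year); the variable names are illustrative.

```r
n <- 42

# Read as %td: days since 01jan1960
as_day <- format(as.Date(n, origin = "1960-01-01"))           # "1960-02-12"

# Read as %tw: weeks since 1960w1, with exactly 52 weeks in every year
as_week <- sprintf("%dw%d", 1960 + n %/% 52, n %% 52 + 1)     # "1960w43"

# Read as %tq: quarters since 1960q1
as_quarter <- sprintf("%dq%d", 1960 + n %/% 4, n %% 4 + 1)    # "1970q3"

c(as_day, as_week, as_quarter)
```

The stored value never changes; only the rule used to decode it does, which is why an importing package needs to know the display format to pick the right rule.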
The question presupposes that it was correct to describe some weekly dates in terms of Stata weeks. This seems unlikely, as I know no instance in which a body outside StataCorp uses the week rules of Stata, not only that week 1 always starts on 1 January, but also that week 52 always includes either 8 or 9 days and hence that there is never a week 53 in a calendar year.
So, you need to go upstream and find out what the data should have been. Failing some explanation, my best advice is to map the 52 weeks of each year to the days that start them, namely days 1(7)358 of each calendar year.
Stata weeks won't map one-to-one to any other scheme for defining weeks.
More in this article on Stata weeks
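The suggested mapping of the 52 weeks to days 1(7)358 can be sketched in R as follows. The helper names stata_week_starts and week_lengths are hypothetical, and the code simply encodes the rules stated above.

```r
# Start dates of the 52 Stata weeks of a calendar year: days 1, 8, ..., 358
stata_week_starts <- function(year) {
  jan1 <- as.Date(sprintf("%04d-01-01", as.integer(year)))
  jan1 + seq(0, 357, by = 7)
}

# Length of each week in days; week 52 absorbs the leftover days of the year
week_lengths <- function(year) {
  next_jan1 <- as.Date(sprintf("%04d-01-01", as.integer(year) + 1L))
  as.numeric(diff(c(stata_week_starts(year), next_jan1)))
}

starts_1985 <- stata_week_starts(1985)
length(starts_1985)            # 52
format(starts_1985[52])        # "1985-12-24"
tail(week_lengths(1985), 1)    # 8
tail(week_lengths(1988), 1)    # 9 (leap year)
```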
It's not completely clear what the question is but the year and week corresponding to 1304 are:
wk <- 1304
1960 + wk %/% 52
## [1] 1985
wk %% 52 + 1
## [1] 5
so assuming that the first week of the year is week 1 and starts on Jan 1st, the beginning of the above week is this date:
as.Date(paste(1960 + wk %/% 52, 1, 1, sep = "-")) + 7 * (wk %% 52)
## [1] "1985-01-29"
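The arithmetic above can be wrapped in a small vectorized helper so that a whole column is converted at once. This is a sketch under the same assumption (week 1 starts on 1 January and every year has exactly 52 weeks); the function name stata_tw_to_date is hypothetical.

```r
# Convert Stata weekly integers (0 = 1960w1) to year, week and week start date
stata_tw_to_date <- function(wk) {
  wk   <- as.integer(wk)
  year <- 1960L + wk %/% 52L
  week <- wk %% 52L + 1L
  data.frame(
    year  = year,
    week  = week,
    start = as.Date(sprintf("%04d-01-01", year)) + 7L * (week - 1L)
  )
}

res <- stata_tw_to_date(c(0, 1304, 1426))
res
# Expected: 1960 week 1 (1960-01-01), 1985 week 5 (1985-01-29),
#           1987 week 23 (1987-06-04)
```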

How to convert ordinal date day-month-year format using R

I have log files where the date is given in the ordinal date format
(see the Wikipedia page for ordinal date),
i.e. 14273 means the 273rd day of 2014, so 14273 is 30-Sep-2014.
Is there a function in R to convert an ordinal date (14273) to (30-Sep-2014)?
I tried the date package but didn't come across a function that would do this.
Try as.Date with the indicated format:
as.Date(sprintf("%05d", 14273), format = "%y%j")
## [1] "2014-09-30"
Notes
For more information see ?strptime.
The 273 part is sometimes referred to as the day of the year (as opposed to the day of the month), the day number, or the Julian day relative to the beginning of the year.
If the input were a character string of the form yyjjj (rather than numeric), then as.Date(x, format = "%y%j") will do.
Update: This now also handles years with one digit, as per the comments.
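Applied to a whole vector, and formatted the way the question asks, the approach might look like this. It is a sketch that assumes numeric input whose leading zeros have been lost; the sample values are illustrative.

```r
# Ordinal dates stored as numbers (leading zeros lost, e.g. 09001 became 9001)
x <- c(14273, 9001, 7031, 1033)

# Pad back to five digits, then parse two-digit year (%y) plus day of year (%j)
d <- as.Date(sprintf("%05d", as.integer(x)), format = "%y%j")
format(d)   # "2014-09-30" "2009-01-01" "2007-01-31" "2001-02-02"

# Display as day-month-year; the month abbreviation depends on the locale,
# e.g. "30-Sep-2014" in an English locale
format(d, "%d-%b-%Y")
```

Keep in mind that %y maps two-digit years 00 to 68 to 20xx and 69 to 99 to 19xx, which may matter for older logs.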
Data example
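x <- c("14273", "09001", "07031", "01033")
Data conversion
x1 <- substr(x, start = 1, stop = 2)  # two-digit year
x2 <- substr(x, start = 3, stop = 5)  # day of the year
# Parse year and day of year together: parsing "%j" on its own would assume the
# current year, which can shift days after 28 February when leap status differs.
date <- as.Date(paste(x1, x2), format = "%y %j")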
You can use lubridate package as follows:
> library(lubridate)
> # Create a template date object
> date <- as.POSIXlt("2009-02-10")
> # Update the year and the day of the year
> update(date, year = 2014, yday = 273)
[1] "2014-09-30 JST"

number of hours between dates

I have a netCDF file and I am attempting to identify the first date in the file and the 'base date'. The file contains monthly data. My notes, which are fairly old, indicate the first date is January 1, 1948.
The following R code gives the first date as 17067072:
library(ncdf)
library(chron)
my.data <- open.ncdf('my.netCDF.nc')
x = get.var.ncdf(my.data, "lon" )
y = get.var.ncdf(my.data, "lat" )
z = get.var.ncdf(my.data, "time")
z[1:5]
# [1] 17067072 17067816 17068512 17069256 17069976
I downloaded an application called ncdump.exe and after typing the following line in the Windows command window:
C:\Users\Mark W Miller\ncdump>ncdump -h my.netCDF.nc
I learned that the base date is:
time:units = "hours since 1-1-1 00:00:0.0" ;
This same base date is obtained in R using:
att.get.ncdf(my.data,"time","units")$value
[1] "hours since 1-1-1 00:00:0.0"
I tried to verify that with the following R code:
date1 <- as.Date("01/01/0001", "%m/%d/%Y")
date1
# [1] "0001-01-01"
date2 <- as.Date("01/01/1948", "%m/%d/%Y")
date2
# [1] "1948-01-01"
period <- as.Date(date1:date2, origin = "00-01-01")
hours <- 24 * (length(period)-1)
hours
# [1] 17067024
There is a difference of 48 hours between the number in z[1] and the number returned by the R code immediately above:
17067072 - 17067024
[1] 48
Where is my error? Since the netCDF file contains monthly data I doubt the first date is January 3, 1948. The website from which I downloaded the data does not offer the option of selecting the day within month.
The application ncdump.exe can be downloaded from here:
http://www.narccap.ucar.edu/data/ascii-howto.html
If I can figure out how to subset the netCDF file I might upload the smaller file somewhere.
Thank you for any advice.
Have you looked at your period vector? When I looked at the first few and last few values, the years did not make sense. Possibly something is going wrong in one of the conversions.
Also note that some computer programs treat 1900 as a leap year even though it was not; a difference between two programs on that point could account for 24 of the hours in your difference.
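One way to check the time axis without building the period vector is to work with the differences between the time values and to measure the offset directly. The sketch below uses the five values printed in the question and assumes, as the asker's notes say, that the first record is 1 January 1948. A 48-hour offset would be consistent with the file counting hours on a mixed Julian/Gregorian calendar while R's Date class uses the proleptic Gregorian calendar, but that explanation is an assumption and is not confirmed by the file header.

```r
# Time values printed in the question (hours since 1-1-1 00:00:0.0)
z <- c(17067072, 17067816, 17068512, 17069256, 17069976)

# Spacing between records in days: matches Jan, Feb (leap year), Mar, Apr 1948
diff(z) / 24                       # 31 29 31 30

# Assumption from the asker's notes: the first record is 1 January 1948
first_date <- as.Date("1948-01-01")

# Hours from 0001-01-01 to 1948-01-01 according to R's calendar
hours_r <- 24 * as.numeric(first_date - as.Date("0001-01-01"))
offset_hours <- z[1] - hours_r     # 48

# Dates of all records, anchored on the first record instead of on year 1
dates <- first_date + (z - z[1]) / 24
format(dates)
# "1948-01-01" "1948-02-01" "1948-03-01" "1948-04-01" "1948-05-01"
```

Anchoring on a known record sidesteps the year-1 reference date altogether, which is where calendar conventions differ between programs.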

Resources