R - reading time data from a text file

Hi and thanks in advance,
I'm trying to read a list of test dates from a text file and plot them on a graph against some test values (called Quantity).
The issue is that the times are not read correctly from the file. When I plot them, the axis values are badly distorted and don't display anything remotely resembling a date.
Here is my code:
Frame <- read.table("....path.../Frame.txt")
Frame$Time <- as.Date(Frame$Time)
TheForecast <- naive(Frame)
plot(TheForecast, xlab="Time",ylab="Quantity",main="Stock Quantity vs Time",type='l')
I have tried all the different formats of dates in the text file that I can think of, but they all return the same issue or worse ones.
Here's what I've tried:
Time <- c("01/01/2010", "07/02/2010", "08/03/2010", "02/04/2011", "11/05/2011", "12/06/2011", "06/07/2012", "08/30/2012", "04/16/2013", "03/18/2013", "02/22/2014", "01/27/2014", "12/15/2015", "09/28/2015", "05/04/2016", "11/07/2017", "09/22/2017", "04/04/2017")
Time <- c("2010-01-01", "2010-07-02", "2010-08-03", "2011-02-04", "2011-11-05", "2011-12-06", "2012-06-07", "2012-08-30", "2013-04-16", "2013-03-18", "2014-02-22", "2014-01-27", "2015-12-15", "2015-09-28", "2016-05-04", "2017-11-07", "2017-09-22", "2017-04-04")
Time <- c("1 January 2010", "7 February 2010", "8 March 2010", "2 April 2011", "11 May 2011", "12 June 2011", "6 July 2012", "30 August 2012", "16 April 2013", "18 March 2013", "22 February 2014", "27 January 2014", "15 December 2015", "28 September 2015", "4 May 2016", "7 November 2017", "22 September 2017", "4 April 2017")
Here are the test values for the y (Quantity) axis, just for reference:
Quantity <- c(5,3,8,4,0,5,2,7,4,2,6,8,4,7,8,9,4,6)
Here is an example of the file before reading:
Time Quantity
1 2010-01-01 5
2 2010-07-02 3
3 2010-08-03 8
4 2011-02-04 4
5 2011-11-05 0
6 2011-12-06 5
7 2012-06-07 2
8 2012-08-30 7
9 2013-04-16 4
10 2013-03-18 2
11 2014-02-22 6
12 2014-01-27 8
13 2015-12-15 4
14 2015-09-28 7
15 2016-05-04 8
16 2017-11-07 9
17 2017-09-22 4
18 2017-04-04 6
Thank you.

The naive() function from the forecast package needs a time-series object (class ts) to plot properly.
file_path <- paste0(getwd(),"/data/Frame.txt")
Frame <- read.table(file_path, stringsAsFactors = FALSE)
Frame$Time <- as.Date(Frame$Time, format = "%Y-%m-%d")
Here is your missing step:
Frame <- ts(Frame$Quantity, start = 1, end = NROW(Frame), frequency = 1)
Then proceed with the rest and the time axis should display sensibly (as the observation index, 1 to 18, since the ts object is built from Quantity alone):
library(forecast)
TheForecast <- naive(Frame)
plot(TheForecast, xlab="Time",ylab="Quantity",main="Stock Quantity vs Time",type='l')
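To see what the time axis of such a series looks like, here is a minimal sketch using only base R (the names series and time_index are illustrative, not from the answer). With start = 1 and frequency = 1, the time index is simply the observation number, so the dates themselves are not carried into the ts object:

```r
# Quantity values from the question
Quantity <- c(5, 3, 8, 4, 0, 5, 2, 7, 4, 2, 6, 8, 4, 7, 8, 9, 4, 6)

# One observation per time unit, indexed 1, 2, ..., 18
series <- ts(Quantity, start = 1, frequency = 1)

# The time index that plotting functions will use for the x-axis
time_index <- as.numeric(time(series))
print(time_index)
```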

The following looks OK to me:
## Your Data
df = read.table(text="Time Quantity
1 2010-01-01 5
2 2010-07-02 3
3 2010-08-03 8
4 2011-02-04 4
5 2011-11-05 0
6 2011-12-06 5
7 2012-06-07 2
8 2012-08-30 7
9 2013-04-16 4
10 2013-03-18 2
11 2014-02-22 6
12 2014-01-27 8
13 2015-12-15 4
14 2015-09-28 7
15 2016-05-04 8
16 2017-11-07 9
17 2017-09-22 4
18 2017-04-04 6",
header=TRUE, stringsAsFactors=FALSE)
## Convert string to date
df$Time = as.Date(df$Time, format="%Y-%m-%d")
plot(df, pch=20)
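Note that the rows in the sample file are not in chronological order (for example, 2013-04-16 comes before 2013-03-18), so a line plot of the raw rows would zigzag. A minimal sketch, assuming the same column names as above and using a hypothetical data frame name stock with a few rows from the question, sorts by date before drawing a line:

```r
# A few rows from the question, deliberately out of order
stock <- read.table(text = "Time Quantity
1 2012-08-30 7
2 2013-04-16 4
3 2013-03-18 2
4 2014-02-22 6
5 2014-01-27 8", header = TRUE, stringsAsFactors = FALSE)

# Convert the strings to Date, then sort the rows chronologically
stock$Time <- as.Date(stock$Time, format = "%Y-%m-%d")
stock <- stock[order(stock$Time), ]

# Line plot with a proper date axis
plot(stock$Time, stock$Quantity, type = "l", xlab = "Time", ylab = "Quantity")
```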

Related

How to convert monthly data as.date type variable in R?

I have monthly data and want to convert the period column to Date class in R.
In addition, the rows of the data frame are not in chronological order.
df <- data.frame (period = c("March 2019", "February 2019", "January 2019", "May 2019","April 2019","August 2019","June 2019","July 2019","November 2019","September 2019","October 2019","December 2019"),sales = rnorm(12))
period sales
1 March 2019 1.841711557
2 February 2019 0.403043685
3 January 2019 0.524417978
4 May 2019 0.236378511
5 April 2019 -0.099441313
6 August 2019 0.001731664
7 June 2019 0.792067260
8 July 2019 -0.352379347
9 November 2019 1.174681909
10 September 2019 0.075480279
11 October 2019 -0.258695621
12 December 2019 -1.775315927
Use as.Date with an appropriate format on the period with a day of 1 pasted in front, then order by the result.
transform(dat, period=as.Date(paste(1, period), '%d %b %Y')) |>
{\(.) .[order(.$period), ]}()
# period sales
# 1 2019-01-01 0.25542882
# 5 2019-02-01 0.11748736
# 10 2019-03-01 0.98889173
# 6 2019-04-01 0.47499708
# 2 2019-05-01 0.46229282
# 8 2019-06-01 0.90403139
# 12 2019-07-01 0.08243756
# 7 2019-08-01 0.56033275
# 4 2019-09-01 0.97822643
# 9 2019-10-01 0.13871017
# 11 2019-11-01 0.94666823
# 3 2019-12-01 0.94001452
Data:
set.seed(42)
dat <- data.frame(period=sample(paste(month.name, 2019)),
sales=runif(12))
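Because month-name formats such as %b depend on the session locale, parsing English month names can return NA on a non-English system. A locale-independent sketch, assuming the period strings always look like "<full English month name> <year>" (the helper name period_to_date and the data frame name sales_df are hypothetical), looks the month up in R's built-in month.name constant instead:

```r
period_to_date <- function(period) {
  # Split "March 2019" into c("March", "2019")
  parts <- strsplit(period, " ", fixed = TRUE)
  month_num <- match(vapply(parts, `[`, character(1), 1), month.name)
  year <- as.integer(vapply(parts, `[`, character(1), 2))
  # First day of the month; unknown month names become NA
  as.Date(sprintf("%04d-%02d-01", year, month_num), format = "%Y-%m-%d")
}

# First three rows of the question's example
sales_df <- data.frame(
  period = c("March 2019", "February 2019", "January 2019"),
  sales = c(1.841711557, 0.403043685, 0.524417978),
  stringsAsFactors = FALSE
)
sales_df$period <- period_to_date(sales_df$period)
sales_df <- sales_df[order(sales_df$period), ]
print(sales_df)
```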

Convert character to Date (Thu Jun 14 *** 2018-05-14) in r [duplicate]

I have a data frame of World Cup matches that includes date, location, match_name, etc.
I want to convert the date column to a Date in a format like "2018-05-06".
Here is my file:
date match_name price
1 Thu Jun 14 Russia v Saudi Arabia €453.92
2 Fri Jun 15 Egypt v Uruguay €90.00
3 Tue Jun 19 Russia v Egypt €297.45
4 Wed Jun 20 Uruguay v Saudi Arabia €95.00
and here is what I expect:
date match_name price
1 2018-05-14 Russia v Saudi Arabia €453.92
2 2018-05-15 Egypt v Uruguay €90.00
3 2018-05-19 Russia v Egypt €297.45
4 2018-05-20 Uruguay v Saudi Arabia €95.00
This is surely not the easiest way to do it, but I just wanted you to have a quick answer.
library(stringr)
library(dplyr)
Data <- data.frame(date=c("Thu Jun 14","Fri Jun 15","Tue Jun 19","Wed Jun 20"),match_name=c("a","b","c","d"),price=c(1,2,3,4),stringsAsFactors=FALSE)
# Regular expression matching any English month abbreviation ("Jan|Feb|...|Dec")
month_pattern <- paste(month.abb, collapse="|")
Data <- mutate(Data,
# day of the month: the first run of digits in the string
datenum=str_extract(date, "[[:digit:]]+"),
# two-digit month number, found by a vectorised lookup in month.abb
monthnum=sprintf("%02d", match(str_extract(date, month_pattern), month.abb)),
# abbreviated month name as it appears in the string
monthname=str_extract(date, month_pattern))
mutate(Data,Final_Date=paste0("2018-",monthnum,"-",datenum))
Resulting in
date match_name price datenum monthnum monthname Final_Date
1 Thu Jun 14 a 1 14 06 Jun 2018-06-14
2 Fri Jun 15 b 2 15 06 Jun 2018-06-15
3 Tue Jun 19 c 3 19 06 Jun 2018-06-19
4 Wed Jun 20 d 4 20 06 Jun 2018-06-20
OK, let's say you have this data.frame:
myDF <-as.data.frame(x=list(date=c("Thu Jun 14","Fri Jun 15","Tue Jun 19","Wed Jun 20")))
Which constructs the following data.frame:
date
1 Thu Jun 14
2 Fri Jun 15
3 Tue Jun 19
4 Wed Jun 20
Assuming that each game is in 2018:
#for handling month abbreviations in English:
Sys.setlocale("LC_TIME", "en_US.UTF-8")
myDF$date <- as.Date(paste0(substr(myDF$date,5,10),", 2018"),format="%b %d, %Y")
The resulting myDF:
date
1 2018-06-14
2 2018-06-15
3 2018-06-19
4 2018-06-20
You can change 2018 to any year you like where necessary.
To convert the "date" variable to Date class (printed as, e.g., '2018-06-14'), you can use the following function:
conv_date <- function(var, year){
var <- as.Date(paste0(var, " ", year), '%a %b %d %Y')
return(var)
}
where:
var - the variable in your data table (i.e. 'date')
year - the year you need
Example:
yours_df$date <- conv_date(yours_df$date, 2018)
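Since the strings carry a weekday, it can serve as a sanity check on the assumed year. A small sketch that avoids locale-dependent parsing by looking names up in month.abb (the helper names parse_match_date and weekday_matches are hypothetical, and the fixed character positions assume the "Thu Jun 14" layout shown in the question):

```r
parse_match_date <- function(x, year) {
  # Characters 5-7 hold the month abbreviation, 9-10 the day of the month
  month_num <- match(substr(x, 5, 7), month.abb)
  day <- as.integer(substr(x, 9, 10))
  as.Date(sprintf("%04d-%02d-%02d", as.integer(year), month_num, day),
          format = "%Y-%m-%d")
}

weekday_matches <- function(x, dates) {
  # %u is the ISO weekday number (1 = Monday ... 7 = Sunday), independent of locale
  day_abbr <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
  substr(x, 1, 3) == day_abbr[as.integer(format(dates, "%u"))]
}

raw <- c("Thu Jun 14", "Fri Jun 15", "Tue Jun 19", "Wed Jun 20")
dates <- parse_match_date(raw, 2018)
print(dates)
print(weekday_matches(raw, dates))
```

If weekday_matches returns FALSE for some rows, the assumed year is probably wrong for those rows.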

How to fill missing and adjust irregular time intervals in a data.frame in R

I have several datasets, mostly with 15-minute time intervals. However, some datasets have missing readings (e.g., the 3rd row in the sample dataset was supposed to be "May 1 2015 00:40AM"). In addition, there are some time steps that are longer than 15 min (e.g., see the 3rd and 6th rows).
How can I add the missing time steps so that my Date continues in 15-minute intervals, and at the same time adjust the time steps longer than 15 min to 15 min?
s <- data.frame(Date = c(
"May 1 2015 00:10AM","May 1 2015 00:25AM",
"May 1 2015 00:56AM","May 1 2015 01:10AM",
"May 1 2015 01:25AM","May 1 2015 01:41AM",
"May 1 2015 01:55AM"),
val = c(1:7)
)
My desired output would be the following:
> s
Date val
1 May 1 2015 00:10AM 1
2 May 1 2015 00:25AM 2
3 May 1 2015 00:40AM NA
4 May 1 2015 00:55AM 3
5 May 1 2015 01:10AM 4
6 May 1 2015 01:25AM 5
7 May 1 2015 01:40AM 6
8 May 1 2015 01:55AM 7
You could try the following:
First, convert the "Date" column of your data frame s to POSIXct so you can work with it (dplyr is attached so that the %>% pipe is available):
library(dplyr)
s <- data.frame(Date = c(
"May 1 2015 00:10AM","May 1 2015 00:25AM",
"May 1 2015 00:56AM","May 1 2015 01:10AM",
"May 1 2015 01:25AM","May 1 2015 01:41AM",
"May 1 2015 01:55AM"),
val = c(1:7)
) %>% dplyr::mutate(Date = lubridate::parse_date_time(Date, "b d Y HM"))
Second, you can join this with another data frame that has all the time intervals you are expecting. First, we construct it, using a difference of time intervals (15 mins in this case):
one <- lubridate::parse_date_time("May 1 2015 00:10AM", orders = "b d Y HM")
two <- lubridate::parse_date_time("May 1 2015 00:25AM", orders = "b d Y HM")
dif <- two - one
Now the dataframe:
other_df <- data.frame(
Date = seq(from = lubridate::parse_date_time("May 1 2015 00:10AM",
orders = "b d Y HM"),
to = lubridate::parse_date_time("May 1 2015 01:55AM",
orders = "b d Y HM"),
by = dif))
Join the two:
result <- dplyr::full_join(other_df, s)
> result
Date val
1 2015-05-01 00:10:00 1
2 2015-05-01 00:25:00 2
3 2015-05-01 00:40:00 NA
4 2015-05-01 00:55:00 NA
5 2015-05-01 01:10:00 4
6 2015-05-01 01:25:00 5
7 2015-05-01 01:40:00 NA
8 2015-05-01 01:55:00 7
9 2015-05-01 00:56:00 3
10 2015-05-01 01:41:00 6
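The full join above keeps the off-grid readings (00:56 and 01:41) as extra rows rather than moving them onto the 15-minute grid. One possible way to reach the desired output, sketched in base R, is to snap each timestamp to the nearest grid point before merging. This assumes that the first reading defines the grid, that the trailing "AM" can simply be dropped (since "00:10AM" is already written in 24-hour style), and that the session locale understands the English abbreviation "May"; the names step, origin and grid are illustrative:

```r
s <- data.frame(Date = c(
  "May 1 2015 00:10AM", "May 1 2015 00:25AM",
  "May 1 2015 00:56AM", "May 1 2015 01:10AM",
  "May 1 2015 01:25AM", "May 1 2015 01:41AM",
  "May 1 2015 01:55AM"),
  val = 1:7,
  stringsAsFactors = FALSE
)

# Drop the trailing "AM" and parse as 24-hour time in UTC
s$Date <- as.POSIXct(sub("AM$", "", s$Date),
                     format = "%b %d %Y %H:%M", tz = "UTC")

step <- 15 * 60          # grid spacing in seconds
origin <- min(s$Date)    # assumption: the first reading sits on the grid

# Snap every reading to the nearest multiple of 15 minutes after the origin
offset <- as.numeric(difftime(s$Date, origin, units = "secs"))
s$Date <- origin + round(offset / step) * step

# Complete 15-minute grid, then a left join so missing steps become NA
grid <- data.frame(Date = seq(origin, max(s$Date), by = step))
result <- merge(grid, s, by = "Date", all.x = TRUE)
print(result)
```

If two readings snap to the same grid point, both rows are kept, so some further rule (for example, keeping the closer one) would be needed.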

Long string date to short date R

I have a df with dates formatted in the following way.
Date Year
<chr> <dbl>
Sunday, Jul 27 2008
Tuesday, Jul 29 2008
Wednesday, July 31 (1) 2008
Wednesday, July 31 (2) 2008
Is there a simple way to achieve the following format of columns and values? I'd also like to remove the (1) and (2) notations on the two July 31 dates.
Date Year Month Day Day_of_Week
2008-07-27 2008 07 27 Sunday
With base R, you can do:
dat <- data.frame(
Date = c("Sunday, Jul 27" ,"Tuesday, Jul 29", "Wednesday, July 31", "Wednesday, July 31"),
Year = rep(2008, 4),
stringsAsFactors = FALSE
)
dts <- as.POSIXlt(paste(dat$Year, dat$Date), format = "%Y %A, %B %d")
POSIXlt provides a list-based reference for the date/time. To see them, try unclass(dts[1]).
From here it can be rather academic:
dat$Month = 1 + dts$mon # months are 0-based in POSIXlt
dat$Day = dts$mday
dat$Day_of_Week = weekdays(dts)
dat
# Date Year Month Day Day_of_Week
# 1 Sunday, Jul 27 2008 7 27 Sunday
# 2 Tuesday, Jul 29 2008 7 29 Tuesday
# 3 Wednesday, July 31 2008 7 31 Thursday
# 4 Wednesday, July 31 2008 7 31 Thursday
library(dplyr)
library(lubridate)
dat = tibble(date = c('Sunday, Jul 27','Tuesday, Jul 29', 'Wednesday, July 31 (1)','Wednesday, July 31 (2)'), year=rep(2008,4))
dat %>%
mutate(date = gsub("\\s*\\([^\\)]+\\)","",as.character(date)),
date = parse_date_time(date,'A, b! d ')) -> dat1
year(dat1$date) <- dat1$year
# A tibble: 4 × 2
date year
<dttm> <dbl>
1 2008-07-27 2008
2 2008-07-29 2008
3 2008-07-31 2008
4 2008-07-31 2008
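A locale-independent variation on the same idea, as a sketch: strip the "(1)"/"(2)" suffix with a regular expression, look the month up by its first three letters in month.abb, and build the Date from numeric parts. The names raw, clean and out are illustrative, and the sketch assumes every string has the form "<weekday>, <month> <day>":

```r
raw <- c("Sunday, Jul 27", "Tuesday, Jul 29",
         "Wednesday, July 31 (1)", "Wednesday, July 31 (2)")
year <- rep(2008L, 4)

# Drop a trailing parenthesised note such as " (1)"
clean <- sub("\\s*\\([^)]*\\)\\s*$", "", raw)

# Keep the part after the comma, e.g. "Jul 27", and split it into month and day
month_day <- strsplit(sub("^[^,]*,\\s*", "", clean), "\\s+")
month_num <- match(substr(vapply(month_day, `[`, character(1), 1), 1, 3), month.abb)
day <- as.integer(vapply(month_day, `[`, character(1), 2))

out <- data.frame(
  Date = as.Date(sprintf("%04d-%02d-%02d", year, month_num, day),
                 format = "%Y-%m-%d"),
  Year = year,
  Month = sprintf("%02d", month_num),
  Day = day
)
# weekdays() returns names in the session locale
out$Day_of_Week <- weekdays(out$Date)
print(out)
```

Day_of_Week is computed from the date itself, so it can differ from the weekday written in the original string.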

Counting number of month between two dates whose class is yearmon?

I need to create a new data frame from my original one, whose format is given below.
MonthFrom MonthTo
Jan 2010 May 2010
Mar 2010 Jan 2012
Jan 2011 Jun 2011
Mar 2010 Jun 2010
Feb 2012 Mar 2012
Feb 2013 Feb 2013 #please note that these two months are the same.
The example data set above is from my data. I want to create a data frame as below.
Month NumberofMonth
Jan 5
Jan 12
Feb 1
Feb 2
Mar 16
Mar 4
In general, the function should count the number of months between two dates (of class yearmon) and assign this number to the corresponding starting month. For example, if the number of months in the first row is 5 and MonthFrom in the first row is January, the function should assign 5 to January. Can anyone help me, please?
Given that the zoo type yearmon you're using allows for basic math manipulation and month name extraction with format(), the following should work for you (unless I've missed something in your requirements):
library(zoo)
my.df <- data.frame(
MonthFrom=as.yearmon(c("Jan 2010", "Mar 2010", "Jan 2011", "Mar 2010", "Feb 2012", "Feb 2013")),
MonthTo=as.yearmon(c("May 2010", "Jan 2012", "Jun 2011", "Jun 2010", "Mar 2012", "Feb 2013")))
print(my.df)
## MonthFrom MonthTo
## 1 Jan 2010 May 2010
## 2 Mar 2010 Jan 2012
## 3 Jan 2011 Jun 2011
## 4 Mar 2010 Jun 2010
## 5 Feb 2012 Mar 2012
## 6 Feb 2013 Feb 2013
new.df <- data.frame(
Month=format(my.df$MonthFrom, "%b"),
# yearmon differences are in fractional years, so round after converting to months
NumberOfMonth= round((my.df$MonthTo - my.df$MonthFrom) * 12) + 1)
print(new.df)
## Month NumberOfMonth
## 1 Jan 5
## 2 Mar 23
## 3 Jan 6
## 4 Mar 4
## 5 Feb 2
## 6 Feb 1
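The same count can be sketched without zoo by turning each "Mon YYYY" string into a running month index (year * 12 + month) and subtracting, which keeps everything in integer arithmetic. The helper name month_index and the data frame name counts are hypothetical, and the lookup in month.abb assumes English abbreviations as in the question:

```r
month_index <- function(x) {
  # Split "Jan 2010" into c("Jan", "2010")
  parts <- strsplit(x, " ", fixed = TRUE)
  mon <- match(vapply(parts, `[`, character(1), 1), month.abb)
  yr <- as.integer(vapply(parts, `[`, character(1), 2))
  # Running month number, so consecutive months differ by exactly 1
  yr * 12L + mon
}

month_from <- c("Jan 2010", "Mar 2010", "Jan 2011", "Mar 2010", "Feb 2012", "Feb 2013")
month_to   <- c("May 2010", "Jan 2012", "Jun 2011", "Jun 2010", "Mar 2012", "Feb 2013")

counts <- data.frame(
  Month = substr(month_from, 1, 3),
  # Inclusive count: the same month on both sides gives 1
  NumberOfMonth = month_index(month_to) - month_index(month_from) + 1L
)
print(counts)
```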

Resources