Adding time to day-specified dates in R - r

I want to add hours to day specified dates. And I want the output to be in date format. I wrote the below code:
day<-as.Date(c("20-01-2016", "21-01-2016", "22-01-2016", "23-01-2016"),format="%d-%m-%Y")
hour<-c("12:00:00")
date<-as.Date(paste(day,hour), format="%d-%m-%Y %h:%m:%s")
However, this code produces NAs:
> date
[1] NA NA NA NA
How can I do this in R? I will be very glad for any help. Thanks a lot.
The below code also doesn't work:
day<-as.Date(c("20-01-2016", "21-01-2016", "22-01-2016", "23-01-2016"),format="%d-%m-%Y")
time <- "12:00:00"
x <- paste(day, time)
x1<-as.POSIXct(x, format = "%d-%m-%Y %H:%M:%S")
It still produces NAs:
> x1
[1] NA NA NA NA

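paste() converts a Date to its ISO text form (for example "2016-01-20"), so the format string has to match what was pasted, not the original input. You can do either of these two:
# Option 1: convert to Date first; the pasted string is then year-month-day
dates <- as.Date(c("20-01-2016", "21-01-2016", "22-01-2016", "23-01-2016"), format = "%d-%m-%Y")
time <- "12:00:00"
x <- paste(dates, time)
as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S")
# Option 2: keep the original strings and parse them in their own day-month-year layout
dates <- c("20-01-2016", "21-01-2016", "22-01-2016", "23-01-2016")
time <- "12:00:00"
x <- paste(dates, time)
as.POSIXct(x, format = "%d-%m-%Y %H:%M:%S")
I personally find the second version simpler.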
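To see why the format string matters, it can help to look at the pasted text itself. The following is a minimal sketch, not part of the original answer; the names `pasted`, `stamp` and `wrong` are illustrative, and `tz = "UTC"` is an assumption added so the result does not depend on the local time zone.

```r
# A Date prints in ISO order, whatever format it was parsed from
day <- as.Date(c("20-01-2016", "21-01-2016"), format = "%d-%m-%Y")
pasted <- paste(day, "12:00:00")
print(pasted)

# Matching format: parses cleanly
stamp <- as.POSIXct(pasted, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
print(stamp)

# Mismatched format (the one from the question): every element is NA
wrong <- as.POSIXct(pasted, format = "%d-%m-%Y %H:%M:%S", tz = "UTC")
print(wrong)
```

Note also that `%h`, `%m` and `%s` in the first attempt are not the hour, minute and second codes; those are `%H`, `%M` and `%S`.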

Related

Convert date format from dd-mm-yyyy to dd/mm/yyyy

I would like to convert my date format from dd-mm-yyyy to dd/mm/yyyy.
Data:
date
1 22-Jul-2020
Current code:
format(as.Date(df$date, '%d:%m:%Y'), '%d/%m/%Y' )
[1] NA NA
Desired Output:
date
1 22/07/2020
The format in as.Date should match the input format. Here that is %d followed by -, then the abbreviated month name (%b) followed by -, and the 4-digit year (%Y):
df$date <- format(as.Date(df$date, '%d-%b-%Y'), '%d/%m/%Y' )
df$date
#[1] "22/07/2020"
data
df <- structure(list(date = "22-Jul-2020"), class = "data.frame", row.names = "1")
You can try
library(lubridate)
df <- data.frame(date = c("22-Jul-2020"))
df$date <- dmy(df$date)
df$date <- format(df$date, format = "%d/%m/%Y")
# date
#1 22/07/2020
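One caveat with `%b` is that month-name matching follows the current locale, so "Jul" may fail to parse on a non-English system. As a hedged alternative, here is a small sketch that avoids locale-dependent parsing by looking the abbreviation up in base R's `month.abb` constant. The vector `x` and the helper names are illustrative, and the sketch assumes every input has the dd-Mon-yyyy shape with English abbreviations.

```r
x <- c("22-Jul-2020", "03-Jan-2021")

# Split each string into day, month abbreviation and year
parts <- strsplit(x, "-", fixed = TRUE)
day   <- vapply(parts, `[`, character(1), 1)
mon   <- match(vapply(parts, `[`, character(1), 2), month.abb)  # month number 1..12
year  <- vapply(parts, `[`, character(1), 3)

# Reassemble as dd/mm/yyyy
out <- sprintf("%s/%02d/%s", day, mon, year)
print(out)
```

An unrecognised month abbreviation gives `NA` from `match()`, so it is worth checking `anyNA(mon)` before trusting the output.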

How to set timestamp as an index for a data frame in R

I have a data frame called 'trial'. I have combined the Date and Time columns to get a field that holds the timestamp as POSIXct. I want to set this combined timestamp as the index of my data frame 'trial'. How can I do so? I have seen similar questions on this, with no success.
The code is as follows:
trial <- read.csv("2018_05_04_h093500.csv", header=TRUE, skip = 16, sep=",")
trial$Date <- with(trial, as.POSIXct(paste(as.Date(Date, format="%Y/%m/%d"), Time)))
dtPOSIXct <- as.POSIXct(trial$Date )
dtTime <- as.numeric(dtPOSIXct - trunc(dtPOSIXct, "days"))
class(dtTime) <- "POSIXct"
If you're talking about row names, which are the R equivalent of the index in pandas, they can't be POSIXct datetimes; they have to be character strings.
# sample data
x <- data.frame('a' = 1:3, 'b' = c('a', 'b', 'c'))
dates <- c('2018-01-01 15:51:33', '2018-01-04 11:42:31', '2018-01-07 22:04:41')
# the POSIXct values are converted to character when assigned as row names
rownames(x) <- as.POSIXct(dates, format = '%Y-%m-%d %H:%M:%S')
print(x)
# a b
#2018-01-01 15:51:33 1 a
#2018-01-04 11:42:31 2 b
#2018-01-07 22:04:41 3 c
print(class(rownames(x)[1]))
# [1] "character"
That said, you can still coerce them to POSIXct (or any other class, obviously) at the time of evaluation, at the cost of some overhead and code clutter:
# print the rows of x whose row names, coerced to POSIXct, fall after d
f <- '%Y-%m-%d %H:%M:%S'
d <- as.POSIXct('2018-01-03 00:00:00', format = f)
x[as.POSIXct(rownames(x), format = f) > d, ]
# a b
#2018-01-04 11:42:31 2 b
#2018-01-07 22:04:41 3 c
However, perhaps an easier approach would be to just have an arbitrary column effectively act as a datetime index:
x$date <- as.POSIXct(dates, format = '%Y-%m-%d %H:%M:%S')
class(x$date[1])
# [1] "POSIXct" "POSIXt"
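As an illustration of the column-as-index approach, the sketch below filters and sorts on the datetime column directly, with no coercion at evaluation time. It reuses the sample data from this answer; `cutoff`, `later` and `sorted` are illustrative names, and `tz = "UTC"` is an assumption added to keep the comparison independent of the local time zone.

```r
x <- data.frame(a = 1:3, b = c("a", "b", "c"))
dates <- c("2018-01-01 15:51:33", "2018-01-04 11:42:31", "2018-01-07 22:04:41")

# Keep the timestamps as a real POSIXct column
x$date <- as.POSIXct(dates, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Comparisons work on the column without converting anything
cutoff <- as.POSIXct("2018-01-03 00:00:00", tz = "UTC")
later <- x[x$date > cutoff, ]
print(later)

# Sorting by time, newest first
sorted <- x[order(x$date, decreasing = TRUE), ]
print(sorted)
```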

Changing date format (NA error)

So I have this data file which includes dates and other values. I've imported my data using the following code:
df <- read.csv(file.choose(), header=T, stringsAsFactors=F)
This is so that all the values in the data frame are in character. This makes the next step easier for me.
The data.frame (df) includes:
date x
20020102 1
20020102 2
The date changes every few thousand rows.
I want to change the date format so that it would be yyyy-mm-dd.
I've tried using the code:
df$date <- as.Date(df$date, format="%Y-%m-%d")
and have also used
df$date <- strptime(df$date, format="%Y-%m-%d")
but have always gotten NA values in the date column.
I'm a beginner at R so it would be very helpful if the solution could be simple or can be explained clearly.
Thanks so much!
You can use the correct format
df$date <- as.Date(df$date, format='%Y%m%d')
It is not clear whether your 'date' column is numeric or non-numeric. If it is numeric, convert it to character first
df$date <- as.Date(as.character(df$date), format='%Y%m%d')
However, strptime would work even if the column is numeric.
Or use lubridate:
library(lubridate)
ymd(df$date)
The problem is that your column "date" is not of class 'Date'; it is a 'numeric' vector, so as.Date returns NAs.
You can check the class of the date column with this command:
class(df$date)
Following the advice from @akrun, you should transform the date column into a 'character' vector; then you can format it the way you want:
### your data example:
df <- data.frame(date = c(20020102, 20020102),
x = c(1,2))
class(df$date)
#> [1] "numeric"
# convert the date column to character
df$date <- as.character(df$date)
# Then, convert to the desired date format:
df$date <- as.Date(df$date, format='%Y%m%d')
# check the results:
df
#> date x
#> 1 2002-01-02 1
#> 2 2002-01-02 2
class(df$date)
#> [1] "Date"
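The two answers above can be combined into one short check. This is only a sketch with made-up values (`raw`, `parsed`, `bad` are illustrative names): it shows that the format string describes the input layout, that a Date already prints as yyyy-mm-dd, and that values which are not valid dates come back as NA rather than raising an error.

```r
raw <- c(20020102, 20020103)   # numeric, as read.csv would typically give

# The format describes the INPUT (no separators), not the desired output
parsed <- as.Date(as.character(raw), format = "%Y%m%d")
print(parsed)

# A Date prints as yyyy-mm-dd; format() gives the same text as character
print(format(parsed, "%Y-%m-%d"))

# Impossible dates (month 13 here) become NA
bad <- as.Date("20021302", format = "%Y%m%d")
print(is.na(bad))
```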

R - How to subset a table between two specific dates?

I have eight years of hourly data values, and I would like to subset all the values within a specific year: for example, one data set for 2007, another for 2008, and so on. At the moment I have many problems with the date format, because when I specify a time period, I get a different date period.
Here is my table: LValley, and that is what I have tried:
LValley <- read.table("C:/LValley.txt", header=TRUE, dec = ",", sep="\t")
year2007 <- subset(LValley, date > as.Date("01.01.2007 01:00", "%d.%m.%Y %H:%M") & date < as.Date("01.02.2008 01:00", "%d.%m.%Y %H:%M"))
but it returns me another date period, and I would like exactly all the data from 2007.
I have also used the function from this example (Subset a dataframe between 2 dates), and I get the same results:
mydatefunc <- function(x,y){LValley[LValley$date >= x & LValley$date <= y,]}
DATE1 <- as.Date("01.01.2007 01:00", "%d.%m.%Y %H:%M")
DATE2 <- as.Date("01.01.2008 00:00", "%d.%m.%Y %H:%M")
Test2007 <- mydatefunc(DATE1,DATE2)
I would very much appreciate your help,
Kind regards,
Darwin
You need to convert the date column read from the file to the Date class before comparing it with DATE1 and DATE2 (defined as in the question). For example:
LValley <- read.table("LValley.txt", header=TRUE,dec=",", sep="\t", stringsAsFactors=FALSE)
date1 <- as.Date(LValley$date, "%d.%m.%Y %H:%M")
Test2007 <- subset(LValley, date1>=DATE1 & date1 <=DATE2)
dim(Test2007)
#[1] 6249 4
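Because as.Date drops the time of day, an alternative for hourly data is to parse the column as POSIXct and select on the year component. The sketch below uses a small made-up stand-in for the LValley table (the real file is not available here); the column names `date` and `value`, the name `ts`, and `tz = "UTC"` are assumptions.

```r
# Hypothetical stand-in for the LValley table
LValley <- data.frame(
  date  = c("31.12.2006 23:00", "01.01.2007 00:00",
            "15.06.2007 12:00", "01.01.2008 00:00"),
  value = 1:4,
  stringsAsFactors = FALSE
)

# Parse with the time of day preserved
ts <- as.POSIXct(LValley$date, format = "%d.%m.%Y %H:%M", tz = "UTC")

# Keep every row whose timestamp falls in calendar year 2007
year2007 <- LValley[format(ts, "%Y") == "2007", ]
print(year2007)
```

The same pattern with `split(LValley, format(ts, "%Y"))` would give one data frame per year in a single step.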

subtracting dates with standardised result

I am subtracting dates in xts i.e.
library(xts)
# make data
x <- data.frame(x = 1:4,
BDate = c("1/1/2000 12:00","2/1/2000 12:00","3/1/2000 12:00","4/1/2000 12:00"),
CDate = c("2/1/2000 12:00","3/1/2000 12:00","4/1/2000 12:00","9/1/2000 12:00"),
ADate = c("3/1/2000","4/1/2000","5/1/2000","10/1/2000"),
stringsAsFactors = FALSE)
x$ADate <- as.POSIXct(x$ADate, format = "%d/%m/%Y")
# object we will use
xxts <- xts(x[, 1:3], order.by= x[, 4] )
#### The subtractions
# answer in days
transform(xxts, lag = as.POSIXct(BDate, format = "%d/%m/%Y %H:%M") - index(xxts))
# answer in hours
transform(xxts, lag = as.POSIXct(CDate, format = "%d/%m/%Y %H:%M") - index(xxts))
Question: How can I standardise the result so that I always get the answer in hours? I cannot simply multiply the days by 24, because I will not know beforehand whether the subtraction will be reported in days or hours.
Unless I can somehow check whether the result is in days, perhaps using grep and a regex, and then multiply within an if clause.
I have tried to work through this and went for the grep/regex approach, but this doesn't even keep the negative sign:
p <- transform(xxts, lag = as.POSIXct(BDate, format = "%d/%m/%Y %H:%M") - index(xxts))
library(stringr)
ind <- grep("days", p$lag)
p$lag[ind] <- as.numeric( str_extract_all(p$lag[ind], "\\(?[0-9,.]+\\)?")) * 24
p$lag
#2000-01-03 2000-01-04 2000-01-05 2000-01-10
# 36 36 36 132
I am convinced there is a more elegant solution...
OK, difftime works when the units are specified explicitly:
transform(xxts, lag = difftime(as.POSIXct(BDate, format = "%d/%m/%Y %H:%M"), index(xxts), units = "hours"))
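The behaviour behind this can be seen without xts. The following minimal sketch uses the first row of the sample data; the names `a`, `b` and `cc` are illustrative and `tz = "UTC"` is an assumption. Plain subtraction lets R pick the unit from the size of the difference, while `difftime()` with `units = "hours"` fixes the unit and keeps the sign.

```r
fmt <- "%d/%m/%Y %H:%M"
a  <- as.POSIXct("1/1/2000 12:00", format = fmt, tz = "UTC")   # BDate, row 1
cc <- as.POSIXct("2/1/2000 12:00", format = fmt, tz = "UTC")   # CDate, row 1
b  <- as.POSIXct("3/1/2000", format = "%d/%m/%Y", tz = "UTC")  # ADate, row 1

# Automatic units differ with the magnitude of the gap
print(units(a - b))    # gap of 36 hours is reported in days
print(units(cc - b))   # gap of 12 hours is reported in hours

# Explicit units give a consistent, signed result
lag_hours <- as.numeric(difftime(a, b, units = "hours"))
print(lag_hours)
```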
