I have a column that I have already converted to the Date class using as.Date:
> test_subset$Date <- as.Date(test_subset$Date, "%d/%m/%Y")
The next column holds the times that correspond to the adjacent column of dates. The problem is that R keeps inserting the current date into this time-only column. I specify the "TIME" column:
> test_subset[[2]] <- strptime(test_subset[[2]], "%H:%M:%S")
but the result appends the current date to the time.
How do I strip the current date out of the "TIME" column so that it holds only the time?
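The date is being inserted because you're using strptime(), which returns a full date-time (POSIXlt); when the input string has no date, the current date is filled in. You can use format() instead, since you don't necessarily need a special class for the times. If we have an example vector x,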
x <- Sys.time() + 0:5
#[1] "2015-03-05 07:08:27 PST" "2015-03-05 07:08:28 PST"
#[3] "2015-03-05 07:08:29 PST" "2015-03-05 07:08:30 PST"
#[5] "2015-03-05 07:08:31 PST" "2015-03-05 07:08:32 PST"
To get the date only, you can use
as.Date(format(x, "%F"))
# [1] "2015-03-05" "2015-03-05" "2015-03-05" "2015-03-05"
# [5] "2015-03-05" "2015-03-05"
And to get the time only, use format() again, but with %T.
format(x, "%T")
# [1] "07:08:27" "07:08:28" "07:08:29" "07:08:30" "07:08:31"
# [6] "07:08:32"
If it turns out you do need a time class on the second column, you can use chron::times() which gives a "times" class.
(time <- chron::times(format(x, "%T")))
# [1] 07:08:27 07:08:28 07:08:29 07:08:30 07:08:31 07:08:32
class(time)
# [1] "times"
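Applied to a data frame like the one in the question, the format() approach might look like the sketch below. The sample rows are made up for illustration, and the extra stamp column is an assumption about what you might want next, not part of the original question.

```r
# Hypothetical data frame mirroring the question's layout (values are made up)
test_subset <- data.frame(
  Date = c("05/03/2015", "06/03/2015"),
  TIME = c("07:08:27", "18:30:00"),
  stringsAsFactors = FALSE
)
test_subset$Date <- as.Date(test_subset$Date, "%d/%m/%Y")

# strptime() returns POSIXlt, which always carries a date as well as a time
parsed <- strptime(test_subset$TIME, "%H:%M:%S", tz = "UTC")

# format() keeps only the clock part, as a plain character vector
test_subset$TIME <- format(parsed, "%T")

# If a real date-time is needed later, combine both columns instead
test_subset$stamp <- as.POSIXct(
  paste(test_subset$Date, test_subset$TIME),
  tz = "UTC"
)
```

The TIME column ends up as character, so it sorts correctly as text but cannot be used directly in arithmetic; the combined stamp column can.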
I have a data frame with the amount of time it takes to do a lap and I'm trying to separate that into individual data frames for each driver.
These time values look like this, in minutes:seconds.milliseconds format, except for the first lap, which has a colon between the seconds and the fractional part.
13:14:50 1:28.322 1:24.561 1:23.973 1:23.733 1:24.752
I'd like to have these in a separate data frame in a seconds format like this.
794.500 88.322 84.561 83.973 83.733 84.752
When I convert this to a numeric it gives the following values.
214 201 174 150 133 183
When I use strptime or POSIXlt, I get huge values that are also wrong, even when I supply the format codes. However, by subtracting two values I found that the time difference was correct, and from that I found that they were all off by 1609164020. These values also drop the decimal part, which I need.
You can use POSIXlt in conjunction with a conversion to seconds.
First, add a date to your first time element:
ds <- c("13:14:50", "1:28.322", "1:24.561", "1:23.973", "1:23.733", "1:24.752")
ds[1] <- paste( Sys.Date(), ds[1] )
#[1] "2020-12-29 13:14:50" "1:28.322" "1:24.561"
#[4] "1:23.973" "1:23.733" "1:24.752"
Create a function to convert the subsequent minutes:seconds.milliseconds to seconds.milliseconds:
to_sec <- function(x) {
  # minutes (before the colon) * 60 + seconds (after the colon)
  as.numeric(sub(":.*", "", x)) * 60 + as.numeric(sub(".*:", "", x))
}
Convert the vector to dates that enable calculation of time differences:
ds[2:6] <- to_sec(ds[2:6])
ds[2:6] <- cumsum(ds[2:6])
dv <- c( as.POSIXlt(ds[1]), as.POSIXlt(ds[1]) + as.numeric(ds[2:6]) )
# [1] "2020-12-29 13:14:50 CET" "2020-12-29 13:16:18 CET"
# [3] "2020-12-29 13:17:42 CET" "2020-12-29 13:19:06 CET"
# [5] "2020-12-29 13:20:30 CET" "2020-12-29 13:21:55 CET"
dv[6] - dv[1]
# Time difference of 7.089017 mins
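If all you need is the plain seconds vector shown in the question, a direct string conversion may be enough. This is a minimal sketch that assumes the first lap's "13:14:50" means minutes:seconds:hundredths, which is inferred from the desired 794.500 and not stated outright; lap_to_seconds is a hypothetical helper name.

```r
laps <- c("13:14:50", "1:28.322", "1:24.561", "1:23.973", "1:23.733", "1:24.752")

# Hypothetical helper: "minutes:seconds.fraction" -> seconds as numeric
lap_to_seconds <- function(x) {
  # Assumption: a second colon stands in for the decimal point,
  # so "13:14:50" is rewritten as "13:14.50"
  x <- sub("^([0-9]+:[0-9]+):([0-9]+)$", "\\1.\\2", x)
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(
    parts,
    function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
    numeric(1)
  )
}

secs <- lap_to_seconds(laps)
```

Because the result is an ordinary numeric vector, the decimals are preserved and no date or time zone is involved.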
I would like to change only the year of a POSIX date-time value, for example from 2013-12-30 XX:XX:XX to 2012-12-30 XX:XX:XX. The solution should be general, as there are hundreds of instances with different hours. Is this possible while keeping the column as a POSIX value?
1) Base R. Convert to POSIXlt, subtract one from the year component and convert back to POSIXct. No packages are used.
yearMinus <- function(x, n = 1) {
lt <- as.POSIXlt(x)
lt$year <- lt$year - n
as.POSIXct(lt)
}
# test
datetimes <- as.POSIXct( c("2013-12-30 03:02:01", "2013-12-30 03:02:01") )
yearMinus(datetimes)
## [1] "2012-12-30 03:02:01 EST" "2012-12-30 03:02:01 EST"
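The POSIXlt route works because the year is stored as its own field, counted in years since 1900. A small sketch of that detail, using a fixed UTC time zone as an assumption so the result does not depend on the local setting:

```r
x <- as.POSIXct("2013-12-30 03:02:01", tz = "UTC")

lt <- as.POSIXlt(x)
stored_year <- lt$year          # years since 1900, so 113 for 2013
calendar_year <- lt$year + 1900 # 2013

# Set the year to a specific value instead of subtracting
lt$year <- 2012 - 1900
y <- as.POSIXct(lt)

format(y, "%Y-%m-%d %H:%M:%S")
```

Only the year field is touched, so the month, day, and clock time carry over unchanged.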
2) gsubfn. Convert to character, match the 4-digit year, convert the match to numeric and subtract 1 (the second argument expresses this transformation in formula notation), then convert back to POSIXct. The replacement is done in a single gsubfn call.
library(gsubfn)
as.POSIXct(gsubfn("\\d{4}", ~ as.numeric(year) - 1, as.character(datetimes)))
## [1] "2012-12-30 03:02:01 EST" "2012-12-30 03:02:01 EST"
If you want to subtract a year from every timestamp in the column:
df$time - lubridate::years(1)
If you want to change only a specific date without changing the time, you can use sub:
df$time <- as.POSIXct(sub("2013-12-30", "2012-12-30", df$time))
I have a date that I convert to a numeric value and want to convert back to a date afterwards.
Converting date to numeric:
date1 = as.POSIXct('2017-12-30 15:00:00')
date1_num = as.numeric(date1)
# 1514646000
Reconverting numeric to date:
as.Date(date1_num, origin = '1/1/1970')
# "4146960-12-12"
What am I missing with the reconversion? I'd expect the last command to return my original date1.
Because the numeric value was created from an object with a time component, the reconversion should go the same way: first back to POSIXct, then wrap the result in as.Date.
as.Date(as.POSIXct(date1_num, origin = '1970-01-01'))
#[1] "2017-12-30"
You could use anytime() and anydate() from the anytime package:
R> pt <- anytime("2017-12-30 15:00:00")
R> pt
[1] "2017-12-30 15:00:00 CST"
R>
R> anydate(pt)
[1] "2017-12-30"
R>
R> as.numeric(pt)
[1] 1514667600
R>
R> anydate(as.numeric(pt))
[1] "2017-12-30"
R>
POSIXct counts the number of seconds since the Unix Epoch, while Date counts the number of days. So you can recover the date by dividing by (60*60*24) (let's ignore leap seconds), or convert back to POSIXct instead.
as.Date(as.numeric(date1)/(60*60*24), origin="1970-01-01")
[1] "2017-12-30"
as.POSIXct(as.numeric(date1),origin="1970-01-01")
[1] "2017-12-30 15:00:00 GMT"
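The seconds-versus-days distinction can be checked directly. A small sketch, using tz = "UTC" as an assumption so the numbers are the same on any machine:

```r
date1 <- as.POSIXct("2017-12-30 15:00:00", tz = "UTC")

secs <- as.numeric(date1)          # seconds since 1970-01-01 00:00:00 UTC
days <- as.numeric(as.Date(date1)) # whole days since 1970-01-01

# Dividing seconds by 86400 gives fractional days; the whole part is the Date value
frac_days <- secs / 86400

# Round trip: numeric -> POSIXct -> Date
back <- as.Date(as.POSIXct(secs, origin = "1970-01-01", tz = "UTC"))
```

Here secs is 1514646000 and days is 17530, which is why feeding the seconds value straight into as.Date() lands millions of years in the future.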
Using lubridate:
lubridate::as_datetime(1514646000)
[1] "2017-12-30 15:00:00 UTC"
I have three data tables in R. Each one has a date column. The tables are vix_data,gold_ohlc_data,btc_ohlc_data. They are formatted as follows:
head(vix_data$Date)
[1] 1/2/04 1/5/04 1/6/04 1/7/04 1/8/04 1/9/04
3435 Levels: 1/10/05 1/10/06 1/10/07 1/10/08 1/10/11 ... 9/9/16
head(gold_ohlc_data$date)
[1] 8/23/17 8/22/17 8/21/17 8/18/17 8/17/17 8/16/17
2519 Levels: 1/10/08 1/10/11 1/10/12 1/10/13 1/10/14 ... 9/9/16
head(btc_ohlc_data$Date)
[1] "2017-08-23" "2017-08-22" "2017-08-21" "2017-08-20" "2017-08-19"
[6] "2017-08-18"
How can I change the date column in the vix_data and gold_ohlc_data tables to match the btc_ohlc_data format? I have tried several methods, for example using as.Date to transform each column, but this usually messes up the values and inserts a lot of NAs.
One option is to use functions from the lubridate package. You need to know which field is the day and which is the month in order to pick the right function, such as dmy or mdy.
# Load package
library(lubridate)
# Create example string
date1 <- c("1/2/04", "1/5/04", "1/6/04", "1/7/04", "1/8/04", "1/9/04")
date2 <- c("8/23/17", "8/22/17", "8/21/17", "8/18/17", "8/17/17", "8/16/17")
# Convert to date class
dmy(date1)
# [1] "2004-02-01" "2004-05-01" "2004-06-01" "2004-07-01" "2004-08-01" "2004-09-01"
mdy(date1)
# [1] "2004-01-02" "2004-01-05" "2004-01-06" "2004-01-07" "2004-01-08" "2004-01-09"
mdy(date2)
# [1] "2017-08-23" "2017-08-22" "2017-08-21" "2017-08-18" "2017-08-17" "2017-08-16"
Look into the lubridate package. Since these dates are in month/day/year order, lubridate::mdy() should handle them just fine.
It looks like your data are read in as factors, so first you'll have to change them to characters. Then after that you can convert it to a date and specify the input format where %m represents the numerical month, %d represents the day, and %y represents the 2-digit year.
x <- c('1/2/04', '1/5/04', '1/6/04', '1/7/04', '1/8/04', '1/9/04')
y <- as.Date(x, format = "%m/%d/%y")
y
[1] "2004-01-02" "2004-01-05" "2004-01-06" "2004-01-07" "2004-01-08"
[6] "2004-01-09"
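To illustrate the factor step and where the NAs in the question probably come from, here is a short sketch. The sample values are taken from the question's output; the vix_dates name is only for illustration.

```r
# Dates read in as a factor, as in the question's vix_data$Date
vix_dates <- factor(c("1/2/04", "1/5/04", "8/23/17"))

# Convert factor -> character -> Date, with month first and a 2-digit year
converted <- as.Date(as.character(vix_dates), format = "%m/%d/%y")

# Date objects print in the same "YYYY-MM-DD" form as btc_ohlc_data$Date
iso_text <- format(converted)

# Swapping day and month gives NA wherever the "month" would exceed 12
swapped <- as.Date("8/23/17", format = "%d/%m/%y")
```

A day/month mix-up like the last line fails only on some rows, which matches the symptom of values being partly wrong and partly NA.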
Are you sure you're specifying as.Date correctly? For example, do you have %y, instead of %Y?
I did the following and it worked:
> vix <- c("1/2/04", "1/5/04", "1/6/04", "1/7/04", "1/8/04", "1/9/04")
> vix<- as.factor(vix)
> vix
[1] 1/2/04 1/5/04 1/6/04 1/7/04 1/8/04 1/9/04
Levels: 1/2/04 1/5/04 1/6/04 1/7/04 1/8/04 1/9/04
> as.Date(vix, "%m/%d/%y")
[1] "2004-01-02" "2004-01-05" "2004-01-06" "2004-01-07" "2004-01-08" "2004-01-09"
I'm trying to do some manipulation on the POSIXct vector, but when I pass it to a function, the vector changes into a numeric vector, instead of retaining POSIXct class, even when the function itself only returns the object:
# Sample dates from the vector and its class.
> dates <- as.POSIXct(c("2012-02-01 12:32:00", "2012-10-24 17:25:56", "2008-09-26 17:13:31", "2011-08-23 11:11:17", "2015-09-19 22:28:33"), tz = "America/Los_Angeles")
> dates
[1] "2012-02-01 12:32:00 PST" "2012-10-24 17:25:56 PDT" "2008-09-26 17:13:31 PDT" "2011-08-23 11:11:17 PDT" "2015-09-19 22:28:33 PDT"
> class(dates)
[1] "POSIXct" "POSIXt"
# Simple subset is retaining original class.
> qq <- dates[1:5]
> qq
[1] "2012-02-01 12:32:00 PST" "2012-10-24 17:25:56 PDT" "2008-09-26 17:13:31 PDT" "2011-08-23 11:11:17 PDT" "2015-09-19 22:28:33 PDT"
> class(qq)
[1] "POSIXct" "POSIXt"
# sapply on the same subset using simple "return" function changes class to "numeric" - why? How to retain "POSIXct"?
> qq2 <- sapply(dates[1:5], function(x) x)
> qq2
[1] 1328128320 1351124756 1222474411 1314123077 1442726913
> class(qq2)
[1] "numeric"
Why does this happen? How can I retain the POSIXct class of the original vector? I know that POSIXct is numeric under the hood, but I want to retain the original class for readability.
We can use lapply instead of sapply, because sapply defaults to simplify = TRUE. If the list elements all have the same length, sapply simplifies the result to a vector or matrix, and since POSIXct is stored as numeric, the class attribute is dropped in the process.
lst <- lapply(dates, function(x) x)
If we need to use sapply, then an option would be simplify = FALSE:
lst <- sapply(dates, function(x) x, simplify=FALSE)
After applying the function, if we need the output as a vector:
do.call("c", lst)
Regarding the change of time zone, it is documented in ?DateTimeClasses:
Using c on "POSIXlt" objects converts them to the current time zone,
and on "POSIXct" objects drops any "tzone" attributes (even if they
are all marked with the same time zone).
So, one possible option (as mentioned in the comments by @kmo) is to flatten the list to numeric and restore the class and time zone:
.POSIXct(unlist(lst), tz = "America/Los_Angeles")
#[1] "2012-02-01 12:32:00 PST" "2012-10-24 17:25:56 PDT" "2008-09-26 17:13:31 PDT" "2011-08-23 11:11:17 PDT" "2015-09-19 22:28:33 PDT"
Or, as @thelatemail mentioned in the comments:
.POSIXct(sapply(dates,I), attr(dates,"tzone") )
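The same idea can be wrapped in a small helper when the per-element function does real work. This is only a sketch: keep_posixct is a hypothetical name, and it assumes the function returns a single date-time per element.

```r
dates <- as.POSIXct(
  c("2012-02-01 12:32:00", "2015-09-19 22:28:33"),
  tz = "America/Los_Angeles"
)

# Hypothetical helper: apply f element-wise, then restore class and time zone
keep_posixct <- function(x, f) {
  # vapply() returns the underlying seconds as a plain numeric vector
  out <- vapply(x, function(el) as.numeric(f(el)), numeric(1))
  # .POSIXct() puts the POSIXct class and the original "tzone" back
  .POSIXct(out, tz = attr(x, "tzone"))
}

# Example: shift every timestamp forward by one hour
shifted <- keep_posixct(dates, function(d) d + 3600)
```

Using vapply() with numeric(1) also guards against a function that accidentally returns something other than one value per element.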