Date-time values are displayed broken in DT::datatable() - r

I have the data frame below and I want to display it with DT::datatable():
library(DT)
library(lubridate)
eventex<-structure(list(registration_type = c("Start", "Stopp"), timestamp = c("25.11.2022 13:19:42",
"31.10.2022 14:19:13")), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame"))
eventex$timestamp<-as.POSIXct(paste0(eventex$timestamp), format="%d.%m.%Y %H:%M:%S")
datatable(eventex)
Why is the timestamp column displayed like this?

People's definition of "normal" may vary, especially when ambiguous (and irrational) conventions such as those commonly used in the US are assumed. There is a formatDate() function that you can apply to your datetime column (the available methods are not well documented in the R package):
DT:::DateMethods
[1] "toDateString" "toISOString" "toLocaleDateString" "toLocaleString" "toLocaleTimeString"
[6] "toString" "toTimeString" "toUTCString"
datatable(eventex) %>% formatDate(~timestamp, "toLocaleString" )
Further information and worked examples can be found here: https://rstudio.github.io/DT/functions.html
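If you want the table to show exactly the text you started with, independent of the viewer's browser locale, another option is to format the column as character before handing it to datatable(). This is only a sketch: the "UTC" time zone and the timestamp_chr column name are my assumptions, not something from the question, and the DT call is guarded so the snippet also runs where DT is not installed.

```r
# Rebuild the example data (same values as in the question)
eventex <- data.frame(
  registration_type = c("Start", "Stopp"),
  timestamp = c("25.11.2022 13:19:42", "31.10.2022 14:19:13"),
  stringsAsFactors = FALSE
)

# Assumption: the timestamps are treated as UTC; substitute your own time zone
tz <- "UTC"
eventex$timestamp <- as.POSIXct(eventex$timestamp,
                                format = "%d.%m.%Y %H:%M:%S", tz = tz)

# Pre-format as character so the table shows exactly this text
# (timestamp_chr is a hypothetical helper column)
eventex$timestamp_chr <- format(eventex$timestamp, "%d.%m.%Y %H:%M:%S", tz = tz)

# Build the widget only if DT is available
if (requireNamespace("DT", quietly = TRUE)) {
  widget <- DT::datatable(eventex[, c("registration_type", "timestamp_chr")])
}
```

The trade-off is that a character column sorts alphabetically in the table, not chronologically, so formatDate() is the better fit when sorting by time matters.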

Related

Creating a function to change a variable type to time

I'm playing around with functions in R and want to create a function that takes a character variable and converts it to a POSIXct.
The time variable currently looks like this:
"2020-01-01T05:00:00.283236Z"
I've successfully converted the time variable in my janviews dataset with the following code:
janviews$time <- gsub('T',' ',janviews$time)
janviews$time <- as.POSIXct(janviews$time, format = "%Y-%m-%d %H:%M:%S", tz = Sys.timezone())
Since I have to perform this on multiple datasets, I want to create a function that will perform this. I created the following function but it doesn't seem to be working and I'm not sure why:
set.time <- function(dat, variable.name){
dat$variable.name <- gsub('T', ' ', dat$variable.name)
dat$variable.name <- as.POSIXct(dat$variable.name, format = "%Y-%m-%d %H:%M:%S", tz = Sys.timezone())
}
Here's the first four rows of the janviews dataset:
structure(list(customer_id = c("S4PpjV8AgTBx", "p5bpA9itlILN",
"nujcp24ULuxD", "cFV46KwexXoE"), product_id = c("kq4dNGB9NzwbwmiE",
"FQjLaJ4B76h0l1dM", "pCl1B4XF0iRBUuGt", "e5DN2VOdpiH1Cqg3"),
time = c("2020-01-01T05:00:00.283236Z", "2020-01-01T05:00:00.895876Z",
"2020-01-01T05:00:01.362329Z", "2020-01-01T05:00:01.873054Z"
)), row.names = c(NA, -4L), class = c("data.table", "data.frame"
), .internal.selfref = <pointer: 0x1488180e0>)
Also, if there is a better way to convert my time variable, I am open to changing my method!
I would use the lubridate package and the as_datetime() function.
lubridate::as_datetime("2020-01-01T05:00:00.283236Z")
Returns
"2020-01-01 05:00:00 UTC"
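As for why the function in the question does nothing: dat$variable.name looks for a column literally called "variable.name", and the function never returns the modified data frame. A minimal base R sketch of a fix follows; the name set_time and the choice of UTC are my assumptions (the trailing "Z" in the data suggests UTC), so adjust as needed.

```r
# Hypothetical helper: convert one character column to POSIXct.
# dat[[variable_name]] looks the column up by the string in variable_name,
# whereas dat$variable_name would look for a column named "variable_name".
set_time <- function(dat, variable_name) {
  dat[[variable_name]] <- as.POSIXct(
    dat[[variable_name]],
    format = "%Y-%m-%dT%H:%M:%OSZ",  # %OS reads fractional seconds
    tz = "UTC"                       # assumption: "Z" means UTC
  )
  dat  # return the modified data frame
}

# Small stand-in for the janviews data
janviews <- data.frame(
  customer_id = c("S4PpjV8AgTBx", "p5bpA9itlILN"),
  time = c("2020-01-01T05:00:00.283236Z", "2020-01-01T05:00:00.895876Z"),
  stringsAsFactors = FALSE
)

# The result has to be assigned back; R does not modify the argument in place
janviews <- set_time(janviews, "time")
```

Because the column name is passed as a string, the same call works for any dataset and any column, e.g. set_time(other_data, "created_at").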

Converting a count down timer to a usable format

I am trying to convert a countdown timer from character into a usable format, ideally a time format, but numeric may work.
I have tried converting it using as.POSIXct and also using the chron package.
Here is a dput of the DF
structure(list(Time = c("(-01:30)", "(-01:15)", "(-01:00)", "(-00:45)",
"(-00:30)", "(-00:15)", "0", "+00:13", "+00:15", "+00:30", "+00:45"
)), row.names = c(NA, -11L), class = c("tbl_df", "tbl", "data.frame"
))
I have already removed the brackets from the time column using
sd$Time = (gsub("[(),//]", "", sd$Time))
Then I tried to convert it using the following:
sd$Time <- as.POSIXct(sd$Time, format="%M:%S")
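One option is strptime(), after stripping the brackets and signs and replacing the bare "0" with "00:00". Note that this drops the sign, and the result carries today's date: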
strptime(sub("^0$", "00:00", gsub("[-+()]", "", sd$Time)), format = "%M:%S")
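If the sign matters (time before versus after zero), a numeric column of signed seconds may be easier to work with than a date-time. The following is a base R sketch on the values from the question; the variable names are mine.

```r
time_chr <- c("(-01:30)", "(-01:15)", "(-01:00)", "(-00:45)", "(-00:30)",
              "(-00:15)", "0", "+00:13", "+00:15", "+00:30", "+00:45")

# Drop the brackets and give the bare "0" the same shape as the other values
clean <- gsub("[()]", "", time_chr)
clean[clean == "0"] <- "+00:00"

# Remember the sign, then split the unsigned part into minutes and seconds
sign <- ifelse(startsWith(clean, "-"), -1, 1)
parts <- strsplit(sub("^[-+]", "", clean), ":", fixed = TRUE)

# Signed seconds relative to zero
secs <- sign * vapply(
  parts,
  function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
  numeric(1)
)
secs
```

The values then sort and plot naturally, with negative numbers for the countdown phase.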

Spread dataframe

I have the following dataframe/tibble sample:
structure(list(name = c("Contents.Key", "Contents.LastModified",
"Contents.ETag", "Contents.Size", "Contents.Owner", "Contents.StorageClass",
"Contents.Bucket", "Contents.Key", "Contents.LastModified", "Contents.ETag"
), value = c("2019/01/01/07/556662_cba3a4fc-cb8f-4150-859f-5f21a38373d0_0e94e664-4d5e-4646-b2b9-1937398cfaed_2019-01-01-07-54-46-064",
"2019-01-01T07:54:47.000Z", "\"378d04496cb27d93e1c37e1511a79ec7\"",
"24187", "e7c0d260939d15d18866126da3376642e2d4497f18ed762b608ed2307778bdf1",
"STANDARD", "vfevvv-edrfvevevev-streamed-data", "2019/01/01/07/556662_cba3a4fc-cb8f-4150-859f-5f21a38373d0_33a8ba28-245c-490b-99b2-254507431d47_2019-01-01-07-54-56-755",
"2019-01-01T07:54:57.000Z", "\"df8cc7082e0cc991aa24542e2576277b\""
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))
I want to spread the name column using the tidyr::spread() function, but I don't get the desired result:
df %>% tidyr::spread(key = name, value = value)
I get an error:
Error: Duplicate identifiers for rows:...
Also tried with melt function same result.
I have connected to S3 using the aws.s3::get_bucket() function and am trying to convert the result to a data frame. I am aware there is an aws.s3::get_bucket_df() function which should do this, but it doesn't work (you may look at my relevant question).
After I've got the bucket list, I've unlisted it and run enframe command.
Please advise.
You can introduce a row-number column first. This gives every row a unique identifier, but it also introduces NAs that you will have to deal with:
df %>%
mutate(RN=row_number()) %>%
group_by(RN) %>%
spread(name,value)
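To get one row per S3 object instead of one row per original line, the identifier can count records, not rows. The sketch below assumes that every record starts with a "Contents.Key" entry, which holds for the sample but is not guaranteed in general; the values are shortened placeholders, not the real keys. It uses base R reshape() to stay dependency-free, and the same id column should also work as the grouping key for spread().

```r
# Shortened placeholder values; the names follow the sample in the question
df <- data.frame(
  name = c("Contents.Key", "Contents.LastModified", "Contents.ETag",
           "Contents.Size", "Contents.Owner", "Contents.StorageClass",
           "Contents.Bucket", "Contents.Key", "Contents.LastModified",
           "Contents.ETag"),
  value = c("key-1", "2019-01-01T07:54:47.000Z", "etag-1", "24187",
            "owner-1", "STANDARD", "bucket-1",
            "key-2", "2019-01-01T07:54:57.000Z", "etag-2"),
  stringsAsFactors = FALSE
)

# Assumption: each record begins with "Contents.Key", so a running count
# of that name labels every row with the record it belongs to
df$id <- cumsum(df$name == "Contents.Key")

# One row per record, one column per name; fields missing from a record are NA
wide <- reshape(df, idvar = "id", timevar = "name", direction = "wide")
wide
```

reshape() prefixes the new columns with "value.", for example value.Contents.Size.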

arrange with period object (ms function) doesn't work - R

I have times recorded in the format mm:ss, where the minutes value can actually be greater than 59. I have parsed them as chr. Now I need to sort my observations in descending order, so I first created a time2 variable with the ms function and used arrange on the new variable. However, arranging doesn't work and the values in the second column are totally mixed up.
library(tidyverse)
library(lubridate)
test <- structure(list(time = c("00:58", "07:30", "08:07", "15:45", "19:30",
"24:30", "30:05", "35:46", "42:23", "45:30", "48:08", "52:01",
"63:45", "67:42", "80:12", "86:36", "87:51", "04:27", "09:34",
"12:33", "18:03", "20:28", "21:39", "23:31", "24:02", "26:28",
"31:13", "43:03", "44:00", "45:38")), .Names = "time", row.names = c(NA,
-30L), class = c("tbl_df", "tbl", "data.frame"))
test %>% mutate(time2 = ms(time)) %>% arrange(time2) %>% View()
How can I arrange these times?
I think it would be easier to just put the times in the same unit and then arrange(). Try this:
test %>% mutate(time_in_seconds = minute(ms(time) )*60 + second(ms(time))) %>%
arrange(desc(time_in_seconds)) %>%
View()
seconds_to_period(minute(ms(test$time))*60 + second(ms(test$time))) # to get right format (with hours)
This is a known limitation with arrange. dplyr does not support S4 objects: https://github.com/tidyverse/lubridate/issues/515
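The same idea works without any Period objects at all: split the string, compute plain numeric seconds, and sort on that. This is a base R sketch on a subset of the values from the question. As far as I know, lubridate's period_to_seconds() gives the same numbers if you prefer to stay with ms().

```r
# Subset of the times from the question, including minutes above 59
time_chr <- c("00:58", "07:30", "63:45", "04:27", "87:51", "45:38")

# Minutes * 60 + seconds, as an ordinary numeric vector
parts <- strsplit(time_chr, ":", fixed = TRUE)
secs <- vapply(
  parts,
  function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
  numeric(1)
)

test <- data.frame(time = time_chr, time_in_seconds = secs,
                   stringsAsFactors = FALSE)

# Descending order on the numeric column
sorted <- test[order(test$time_in_seconds, decreasing = TRUE), ]
sorted
```

Because time_in_seconds is a plain numeric column, arrange(desc(time_in_seconds)) behaves as expected on it too.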

Convert date to month/year format for time series

I have some water quality sample data.
> dput(GrowingArealog90s[1:10,])
structure(list(SampleDate = structure(c(6948, 6949, 6950, 7516,
7517, 7782, 7783, 7784, 8092, 8106), class = "Date"), Flog90 = c(1.51851393987789,
1.48970743802793, 1.81243963000062, 0.273575501327576, 0.874218895695207,
1.89762709129044, 1.44012088794774, 0.301029995663981, 1.23603370361931,
0.301029995663981)), .Names = c("SampleDate", "Flog90"), class = c("tbl_df",
"data.frame"), row.names = c(NA, -10L))
This data is collected monthly, although some months are missed over the 25 year period.
I know there is so much help out there for converting dates to different formats but I have not been able to figure this out. I want to create a time series with just a month/year format, so that I can do things like decompose the data by month and run seasonal kendalls and such. I have tried so many different ways of converting my date to the desired format that I have completely confused myself. I don't care about the exact format as long as it is recognized month/year.
I also need to fill in the missing months with NAs.
I tried uploading the "SampleDate" column in a numeric format, "yyyymm". I could then merge that data frame with another that contained all the dates I need.
GA90 <- merge(Dates, GrowingArealog90s, by.x = "Date", by.y = "Date", all.x = TRUE)
However, when I converted the resulting data frame to a time series it would not recognize the 12 month frequency.
GA90ts <- as.ts(GA90, frequency(12))
> GA90ts
Time Series:
Start = 1
End = 324
Frequency = 1
Any help with this is appreciated.
Here's how to do it with zoo. You'll get a warning that some index entries are not unique (several samples fall in the same month), but it's OK for now. You'll get a series indexed by month and year.
series <-structure(list(SampleDate = structure(c(6948, 6949, 6950, 7516,
7517, 7782, 7783, 7784, 8092, 8106), class = "Date"), Flog90 = c(1.51851393987789,
1.48970743802793, 1.81243963000062, 0.273575501327576, 0.874218895695207,
1.89762709129044, 1.44012088794774, 0.301029995663981, 1.23603370361931,
0.301029995663981)), .Names = c("SampleDate", "Flog90"), class = c("tbl_df",
"data.frame"), row.names = c(NA, -10L))
library(zoo)
series <-as.data.frame(series) #to drop dplyr class
series.zoo <-zoo(series[,-1,drop=FALSE],as.yearmon(series[,1]))
Best practice would be to keep your series with the actual dates and use as.yearmon only when you actually need to make calculations or aggregate.zoo by month and year.
The following is a matter of taste, but I've dealt with a lot of time series and I think zoo is superior to ts and xts. Much more flexible.
Now, to fill in missing values, you have to create a vector of dates. Here, I'm using a zoo object with actual dates. I then use na.locf, which is "last observation carry forward". You could also look at na.approx.
series.zoo <-zoo(series[,-1,drop=FALSE],(series[,1]))
my.seq <-seq.Date(min(series[,1]), max(series[,1]),by="month") #min/max are base R, so no extra package is needed for first/last
merged <-merge.zoo(series.zoo,zoo(,my.seq))
na.locf(merged)
UPDATE
With aggregate.
GrowingArealog90s <-structure(list(SampleDate = structure(c(6948, 6949, 6950, 7516,
7517, 7782, 7783, 7784, 8092, 8106), class = "Date"), Flog90 = c(1.51851393987789,
1.48970743802793, 1.81243963000062, 0.273575501327576, 0.874218895695207,
1.89762709129044, 1.44012088794774, 0.301029995663981, 1.23603370361931,
0.301029995663981)), .Names = c("SampleDate", "Flog90"), class = c("tbl_df",
"data.frame"), row.names = c(NA, -10L))
library(zoo);library(xts)
GrowingArealog90s <-as.data.frame(GrowingArealog90s) #to remove dplyr format
GrowingArealog90s.zoo <-zoo(GrowingArealog90s[,-1,drop=FALSE],as.Date(GrowingArealog90s[,1]))
#First aggregate by month. I chose to get the mean per month
GrowingArealog90s.agg <-aggregate(GrowingArealog90s.zoo, as.yearmon, mean) #replace mean with last to get last reading of the month
#Then create a sequence of months and merge it
my.seq <-seq.Date(first(GrowingArealog90s[,1]), last(GrowingArealog90s[,1]),by="month")
merged <-merge.zoo(GrowingArealog90s.agg ,zoo(,as.yearmon(my.seq)))
na.locf(merged)
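Since the original question was about getting a ts object that reports a frequency of 12, here is a base R sketch of that last step. It averages the samples per month, fills the missing months with NA, and then passes start and frequency to ts() explicitly. Averaging per month is my choice for the illustration, and the variable names are mine.

```r
# Sample data from the question
dates <- as.Date(c(6948, 6949, 6950, 7516, 7517, 7782, 7783, 7784, 8092, 8106),
                 origin = "1970-01-01")
values <- c(1.51851393987789, 1.48970743802793, 1.81243963000062,
            0.273575501327576, 0.874218895695207, 1.89762709129044,
            1.44012088794774, 0.301029995663981, 1.23603370361931,
            0.301029995663981)

# Collapse every date to the first day of its month
month_start <- as.Date(format(dates, "%Y-%m-01"))

# One value per sampled month (mean of the samples in that month)
monthly <- aggregate(values, by = list(month = month_start), FUN = mean)

# Every month between the first and last sample; unsampled months become NA
all_months <- data.frame(
  month = seq(min(month_start), max(month_start), by = "month")
)
full <- merge(all_months, monthly, by = "month", all.x = TRUE)

# Tell ts() where the series starts and that there are 12 periods per year
start_ym <- as.integer(c(format(min(month_start), "%Y"),
                         format(min(month_start), "%m")))
monthly_ts <- ts(full$x, start = start_ym, frequency = 12)
frequency(monthly_ts)
```

Functions such as decompose() need a series without NAs, so the gaps still have to be filled first, for example with na.locf or na.approx from zoo as shown above.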
