I currently have a large data set with the variable time formatted as such: "15:36:44.874541+0000", for one example.
I want to remove the decimals (and the trailing offset) and just have it as "15:36:44" for every value of that variable.
You can convert it into POSIXct type and then extract only time with format.
x <- '15:36:44.874541+0000'
format(as.POSIXct(x, format = "%T"), "%T")
#[1] "15:36:44"
I don't think it's the perfect and most elegant way of doing it, but you can split by "." and keep the first item.
You can write:
x <- "15:36:44.874541+0000"
df <- data.frame(Time = rep(x, 5))
library(dplyr)
# split each value on "." and keep the first piece of every row
df %>% mutate(Time2 = sapply(strsplit(as.character(Time), ".", fixed = TRUE), `[`, 1))
Time Time2
1 15:36:44.874541+0000 15:36:44
2 15:36:44.874541+0000 15:36:44
3 15:36:44.874541+0000 15:36:44
4 15:36:44.874541+0000 15:36:44
5 15:36:44.874541+0000 15:36:44
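The example above repeats a single value, so it does not show what happens when the rows differ. Here is a minimal sketch of the same idea with base R's sub(), which is vectorised and drops the first "." and everything after it. The sample timestamps and the column name Time2 are made up for illustration. Note that this truncates the fractional seconds rather than rounding them.

```r
# Hypothetical sample data with differing timestamps
times <- c("15:36:44.874541+0000", "09:05:01.120000+0000", "23:59:59.999999+0000")
df <- data.frame(Time = times, stringsAsFactors = FALSE)

# Remove the first "." and everything that follows it, row by row
df$Time2 <- sub("\\..*$", "", df$Time)
df$Time2
#[1] "15:36:44" "09:05:01" "23:59:59"
```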
You just need to strip that time and format it.
current_time <- "15:36:44.874541+0000"
formatted_time <- format(strptime(current_time, format = "%H:%M:%S"), "%H:%M:%S")
formatted_time
[1] "15:36:44"
We can use strptime with strftime
strftime(strptime(x, "%T"), "%T")
#[1] "15:36:44"
data
x <- '15:36:44.874541+0000'
In my data, Time and date is stored in a column 'cord' (class=factor). I want to separate the date and the time into two separate columns.
The data looks like this:
1 2019-05-26T13:50:56.335288Z
2 2019-05-26T17:55:45.348073Z
3 2019-05-26T18:12:00.882572Z
4 2019-05-26T18:26:49.577310Z
I have successfully extracted the date using: cord$Date <- as.POSIXct(cord$Time)
I have however not been able to find a way to extract the time in format "H:M:S".
The output of dput(head(cord$Time)) returns a long list of timestamps: "2020-04-02T13:34:07.746777Z", "2020-04-02T13:41:11.095014Z",
"2020-04-02T14:08:05.508818Z", "2020-04-02T14:17:10.337101Z", and so on...
Extract H:M:S
library(lubridate)
format(as_datetime(cord$Time), "%H:%M:%S")
#> [1] "13:50:56" "17:55:45" "18:12:00" "18:26:49"
If you need the fractional seconds too (six digits here, i.e. microseconds):
format(as_datetime(cord$Time), "%H:%M:%OS6")
#> [1] "13:50:56.335288" "17:55:45.348073" "18:12:00.882572" "18:26:49.577310"
where cord is:
cord <- read.table(text = " Time
1 2019-05-26T13:50:56.335288Z
2 2019-05-26T17:55:45.348073Z
3 2019-05-26T18:12:00.882572Z
4 2019-05-26T18:26:49.577310Z ", header = TRUE)
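Since the question asks for the date and the time in two separate columns, here is a minimal base R sketch that does both without extra packages. It assumes the trailing "Z" means the timestamps are in UTC, and the column name TimeOnly is just a name picked for this illustration.

```r
# Two of the timestamps from the question
cord <- data.frame(
  Time = c("2019-05-26T13:50:56.335288Z", "2019-05-26T17:55:45.348073Z"),
  stringsAsFactors = FALSE
)

# %OS accepts fractional seconds; characters after the matched format
# (here the trailing "Z") are ignored by the parser
parsed <- as.POSIXct(cord$Time, format = "%Y-%m-%dT%H:%M:%OS", tz = "UTC")

cord$Date <- as.Date(parsed)                  # calendar date (UTC)
cord$TimeOnly <- format(parsed, "%H:%M:%S")   # time of day as a string
cord
```

Keeping the parsed POSIXct vector around is handy if you later need to sort or compute differences, because the formatted time is only a character string.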
I typically use lubridate and data.table for my date and time manipulation work. This works for me after copying in some of your raw dates as strings:
library(lubridate)
library(data.table)
x <- c("2019-05-26T13:50:56.335288Z", "2019-05-26T17:55:45.348073Z")
# lubridate to parse to date time
y <- parse_date_time(x, "ymd HMS")
# data.table to split in to dates and time
split_y <- tstrsplit(y, " ")
dt <- as.data.table(split_y)
# the columns come out as V1 and V2, so rename both at once
setnames(dt, c("Date", "Time"))
dt[]
# if you use data.frames instead
df <- as.data.frame(dt)
df
I have a dataframe with the column name perioden. This column contains the date but it is written in this format: 2010JJ00, 2011JJ00, 2012JJ00, 2013JJ00 etc..
This column is also a character when I look at the structure. I've tried multiple solutions but so far am still stuck. My question is: how can I convert this column to a date, and how do I remove the JJ00 part so that only the year is shown?
You can try this approach: use gsub() to remove the unwanted text (as suggested by @AllanCameron), then use paste0() to add a day and month, and as.Date() to convert to a date:
#Data
df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'),stringsAsFactors = FALSE)
#Remove string
df$Date <- gsub('JJ00','',df$Date)
#Format to date, you will need a day and month
df$Date2 <- as.Date(paste0(df$Date,'-01-01'))
Output:
Date Date2
1 2010 2010-01-01
2 2011 2011-01-01
3 2012 2012-01-01
4 2013 2013-01-01
We can use ymd with truncated option
library(lubridate)
library(stringr)
ymd(str_remove(df$Date, 'JJ\\d+'), truncated = 2)
#[1] "2010-01-01" "2011-01-01" "2012-01-01" "2013-01-01"
data
df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'), stringsAsFactors = FALSE)
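If only the year is needed, a full Date may not be necessary at all. Here is a minimal sketch that keeps the year as an integer, assuming the first four characters of every value are always the year (the vector below is copied from the question).

```r
perioden <- c("2010JJ00", "2011JJ00", "2012JJ00", "2013JJ00")

# Take the first four characters and convert them to an integer year
year <- as.integer(substr(perioden, 1, 4))
year
#[1] 2010 2011 2012 2013
```

An integer year sorts and plots correctly and avoids inventing a day and month that the data never contained.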
I have this kind of data frame
dat <- read.table(text = " count date
0 10/05/2012
1 10/05/2013
1 10/05/2014
",header = TRUE)
I would like to have a new variable that will contain the following format:
> dat
count date DayMonth
1 0 10/05/2012 10-05
2 1 10/05/2013 10-05
3 1 10/05/2014 10-05
I tried some versions of the strptime function, like dat$DayMonth<-strptime(dat$date, "%d/%m"), but got strange results.
How can I get the desired result
The same can be done using the packages lubridate or anytime.
library(lubridate)
dat$DayMonth <- format(dmy(dat$date), "%d-%m")
# dmy stands for day, month, year; you can also use ymd etc.
library(anytime)
dat$DayMonth <- format(anydate(dat$date), "%d-%m")
We can do this with as.Date and format
dat$DayMonth <- format(as.Date(dat$date, "%d/%m/%Y"), "%d-%m")
dat$DayMonth
#[1] "10-05" "10-05" "10-05"
as.Date converts to the Date class (strptime would likewise give a POSIXlt object), from which we can produce the desired layout using format
NOTE: No external packages used
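As for the "strange results" in the question, a likely explanation is that strptime() parses text into a date-time object instead of producing the "10-05" string. The sketch below shows this on a single value. According to ?strptime, an unspecified year is taken to be the current one, so the parsed object does not carry the 2012 from the input. The tz = "UTC" argument is only there to keep the example independent of the local time zone.

```r
# The second argument of strptime() describes the INPUT layout
d <- strptime("10/05/2012", "%d/%m", tz = "UTC")

class(d)                      # a POSIXlt date-time object, not a character string
day_month <- format(d, "%d-%m")
day_month
#[1] "10-05"
```

Parsing with the complete layout "%d/%m/%Y" first and formatting afterwards keeps the real year in the intermediate object.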
I have a data frame containing dates as characters in dd.mm.yyyy format. I want to convert those to the Date class, in yyyy-m-d format. as.Date() is not working; it returns the error: do not know how to convert 'dates' to class “Date”
dates <- data.frame(cbind(c("5.1.2015", "6.1.2014", "17.2.2014", "28.10.2014")))
colnames(dates) <- c("dates")
as.Date(dates, format = "%Y-%m-%d")
new_format_dates <- cbind(gsub("[[:punct:]]", "", dates[1:nrow(dates),1]))
as.Date(new_format_dates, format = "%Y-%m-%d")
So I tried to remove the . and reformat those dates as new_format_dates, but that returns a result like [1] NA NA NA NA
Firstly, make your data.frames properly; don't use cbind in data.frame. Next, set the format argument of as.Date to the format you've got, including separators. If you don't know the symbol you need, check out ?strptime.
dates <- data.frame(dates = c("5.1.2015", "6.1.2014", "17.2.2014", "28.10.2014"))
dates$dates_new <- as.Date(dates$dates, format = "%d.%m.%Y")
dates
# dates dates_new
# 1 5.1.2015 2015-01-05
# 2 6.1.2014 2014-01-06
# 3 17.2.2014 2014-02-17
# 4 28.10.2014 2014-10-28
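To make the role of the format argument concrete, here is a small sketch on two of the values from the question. It shows that a format describing the desired output gives NA, while a format describing the input parses correctly. The variable names are only for illustration.

```r
x <- c("5.1.2015", "28.10.2014")

# format must describe how the INPUT looks; this one does not match, so NA
wrong <- as.Date(x, format = "%Y-%m-%d")
wrong
#[1] NA NA

# matching the input layout (day.month.year) parses correctly
right <- as.Date(x, format = "%d.%m.%Y")
format(right, "%Y-%m-%d")
#[1] "2015-01-05" "2014-10-28"
```

A Date prints as yyyy-mm-dd by default, so no further step is needed to get that layout once the parse succeeds.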
dates <- data.frame(cbind(c("5.1.2015", "6.1.2014", "17.2.2014", "28.10.2014")))
colnames(dates) <- c("dates")
dates$new_Dates <- gsub("[.]","-",dates$dates)
dates$dates <- NULL
dates_new <- as.Date(dates$new_Dates, format = "%d-%m-%Y")
dates_new <- data.frame(dates_new)
print(dates_new)
If I have a vector such as the following:
REF_YEAR
1994-01-01
1995-01-01
1996-01-01
how can I delete the part "-01-01", so that I only get the year for the whole column?
If your vector is formatted as Dates, you can do:
x <- as.Date("2001-01-01")
format(x, "%Y")
#[1] "2001"
And for your example data:
# Your sample data:
df <- read.table(header=TRUE, text = "REF_YEAR
1994-01-01
1995-01-01
1996-01-01", stringsAsFactors = FALSE)
Convert your data to Date format:
df$REF_YEAR <- as.Date(df$REF_YEAR) # skip this step if it's already formatted as Date
Now convert to year format:
df$REF_YEAR <- format(df$REF_YEAR, "%Y")
Or
transform(df, REF_YEAR = format(REF_YEAR, "%Y"))
Result in both cases:
df
# REF_YEAR
#1 1994
#2 1995
#3 1996
You only need to make sure your data is in Date format (use as.Date() for conversion).
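Note that format() returns character strings. If you want the year as a number, for example to sort or do arithmetic, a minimal sketch is to wrap the result in as.integer(). The variable names here are illustrative only.

```r
ref_year <- as.Date(c("1994-01-01", "1995-01-01", "1996-01-01"))

# format() gives "1994" etc. as text; as.integer() turns that into numbers
year_num <- as.integer(format(ref_year, "%Y"))
year_num
#[1] 1994 1995 1996
```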
This can be done with simple string manipulation. You can either keep the first four characters or drop the last six. Here is how to do it using the second option, as you asked.
ref_year = as.character("1994-01-01")
ref_year_only = substr(ref_year, 1, nchar(ref_year) - 6) ; ref_year_only
Also, please show some effort while asking questions on stack.
Without converting to Date, you could also try:
library(stringr)
# perl() has been removed from stringr; the default regex engine supports look-aheads
df$YEAR <- str_extract(df$REF_YEAR, '\\d+(?=-)')
df$YEAR
#[1] "1994" "1995" "1996"