Connecting 2 columns in R is not working - r

I am trying to read data from a file that holds a date and a time, and wrote the following code to concatenate the columns Date and Time into one column named DateTime:
df <-read.csv("file", header=TRUE)
df = data.frame(DateTime=as.POSIXct(paste(df$Date, df$Time)), df)
The problem is that the output holds only the Date and not the Time.
I also tried to change the format of the data with df$Date <- as.Date(df$Date , "%y/%m/%d") but the output is NA.
Please advise.
The file sample is here:
Date,Time
2011/12/22,02:00:00
2011/12/22,02:01:00
2011/12/22,02:02:00
2011/12/22,02:03:00
2011/12/22,02:04:00
2011/12/22,02:05:00
2011/12/22,02:06:00
2011/12/22,02:07:00
2011/12/22,02:08:00
2011/12/22,02:09:00
2011/12/22,02:10:00
2011/12/22,02:11:00
2011/12/22,02:12:00
2011/12/22,02:13:00
2011/12/22,02:14:00
2011/12/22,02:15:00
2011/12/22,02:16:00
2011/12/22,02:17:00
2011/12/22,02:18:00
2011/12/22,02:19:00
2011/12/22,02:20:00

Try
df$datetime <- as.POSIXct(paste(df$Date, df$Time), format="%Y/%m/%d %H:%M:%S")
df$Date <- as.Date(df$Date, "%Y/%m/%d")
head(df,3)
# Date Time datetime
#1 2011-12-22 02:00:00 2011-12-22 02:00:00
#2 2011-12-22 02:01:00 2011-12-22 02:01:00
#3 2011-12-22 02:02:00 2011-12-22 02:02:00
str(df)
#'data.frame': 21 obs. of 3 variables:
#$ Date : Date, format: "2011-12-22" "2011-12-22" ...
#$ Time : chr "02:00:00" "02:01:00" "02:02:00" "02:03:00" ...
#$ datetime: POSIXct, format: "2011-12-22 02:00:00" "2011-12-22 02:01:00" ...
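A minimal sketch of why the format string matters, using a two-row stand-in for the sample file (the vectors dates and times are illustrative and not from the original post). %y expects a two-digit year, so it cannot match 2011 and the result is NA, whereas %Y matches a four-digit year. The tz = "UTC" argument is an assumption added here so the result does not depend on the local time zone.

```r
# Illustrative stand-in for the Date and Time columns of the sample file
dates <- c("2011/12/22", "2011/12/22")
times <- c("02:00:00", "02:01:00")

# %y reads only two digits ("20"), then fails to match the next "/", giving NA
bad <- as.Date(dates, "%y/%m/%d")

# %Y reads the full four-digit year
good <- as.Date(dates, "%Y/%m/%d")

# Combine both columns and parse with an explicit format (tz is an assumption)
dt <- as.POSIXct(paste(dates, times), format = "%Y/%m/%d %H:%M:%S", tz = "UTC")

# Use format() to confirm that the time part is really stored
shown <- format(dt, "%Y-%m-%d %H:%M:%S")
print(shown)
```

Printing with an explicit format() is a quick way to check whether the time is missing from the data or only from the display.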

Related

How to create a date (column) from a date-time (column) in R

I have imported a CSV containing dates in the column "Activity_Date_Minute". An example date value is "04/12/2016 01:12:00". When I read the .csv into a data frame and extract only the date, the new column shows the date as 4-12-20. Can someone help me get the date in mm-dd-yyyy format in a separate column?
I tried the code below and was expecting to see a column with dates such as 04/12/2016 (mm/dd/yyyy).
#Installing packages
install.packages("tidyverse")
library(tidyverse)
install.packages('ggplot2')
library(ggplot2)
install.packages("dplyr")
library(dplyr)
install.packages("lubridate")
library(lubridate)
##Reading minute-wise METs into "minutewiseMET_Records" and summarizing MET per day for all the IDs
minutewiseMET_Records <- read.csv("minuteMETsNarrow_merged.csv")
str(minutewiseMET_Records)
## converting column ID to character,Activity_Date_Minute to date
minutewiseMET_Records$Id <- as.character(minutewiseMET_Records$Id)
minutewiseMET_Records$Date <- as.Date(minutewiseMET_Records$Activity_Date_Minute)
str(minutewiseMET_Records)
The Console is as follows:
> minutewiseMET_Records <- read.csv("minuteMETsNarrow_merged.csv")
> str(minutewiseMET_Records)
'data.frame': 1048575 obs. of 3 variables:
$ Id : num 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ Activity_Date_Minute: chr "04/12/2016 00:00" "04/12/2016 00:01" "04/12/2016 00:02" "04/12/2016 00:03" ...
$ METs : int 10 10 10 10 10 12 12 12 12 12 ...
> ## converting column ID to character,Activity_Date_Minute to date
> minutewiseMET_Records$Id <- as.character(minutewiseMET_Records$Id)
> minutewiseMET_Records$Date <- as.Date(minutewiseMET_Records$Activity_Date_Minute)
> ## converting column ID to character,Activity_Date_Minute to date
> minutewiseMET_Records$Id <- as.character(minutewiseMET_Records$Id)
> minutewiseMET_Records$Date <- as.Date(minutewiseMET_Records$Activity_Date_Minute)
> str(minutewiseMET_Records)
'data.frame': 1048575 obs. of 4 variables:
$ Id : chr "1503960366" "1503960366" "1503960366" "1503960366" ...
$ Activity_Date_Minute: chr "04/12/2016 00:00" "04/12/2016 00:01" "04/12/2016 00:02" "04/12/2016 00:03" ...
$ METs : int 10 10 10 10 10 12 12 12 12 12 ...
$ Date : Date, format: "4-12-20" "4-12-20" ...
>
I think this will work for you
minutewiseMET_Records$Date <- format(as.Date(minutewiseMET_Records$Activity_Date_Minute, format = "%m/%d/%Y"), "%m/%d/%Y")
First of all, you have to tell R the format of your initial data (here month/day/year, as stated in the question). Then you tell it which format you want for the output.
Activity_Date_Minute isn’t a datetime in your initial data, it’s a character. So you’ll have to first convert it to a datetime (e.g., using lubridate::mdy_hm()), then use as.Date().
library(dplyr)
library(lubridate)
minutewiseMET_Records %>%
mutate(
Activity_Date_Minute = mdy_hm(Activity_Date_Minute),
Activity_Date = as.Date(Activity_Date_Minute)
)
# A tibble: 4 × 2
Activity_Date_Minute Activity_Date
<dttm> <date>
1 2016-04-12 00:00:00 2016-04-12
2 2016-04-12 00:01:00 2016-04-12
3 2016-04-12 00:02:00 2016-04-12
4 2016-04-12 00:03:00 2016-04-12
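For comparison, a base R sketch of the same two steps without lubridate. The vector x is an illustrative stand-in for the Activity_Date_Minute column, and tz = "UTC" is an assumption added so the date does not shift with the local time zone. Without a format argument, as.Date() tries "%Y-%m-%d" and then "%Y/%m/%d", which is most likely why "04/12/2016" was read as year 4, month 12, day 20 in the question.

```r
# Illustrative stand-in for the Activity_Date_Minute column
x <- c("04/12/2016 00:00", "04/12/2016 00:01")

# Step 1: parse the character values as date-times (month/day/year hour:minute)
dt <- as.POSIXct(x, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Step 2: drop the time of day to get a Date column
d <- as.Date(dt, tz = "UTC")

# Optional: a character label in mm/dd/yyyy form for display only
label <- format(d, "%m/%d/%Y")
print(label)
```

Keeping d as a Date and formatting only for display preserves sorting and date arithmetic; label is plain text.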

Converting a column in R to type datetime: as.POSIXct returns NA

I am struggling while trying to convert a column of type char to datetime.
Date.Time Lat Lon Base
<chr> <dbl> <dbl> <chr>
4/1/2014 0:11:00 40.7690 -73.9549 B02512
4/1/2014 0:17:00 40.7267 -74.0345 B02512
4/1/2014 0:21:00 40.7316 -73.9873 B02512
⋮ ⋮ ⋮ ⋮
4/30/2014 23:31:00 40.7443 -73.9889 B02764
4/30/2014 23:32:00 40.6756 -73.9405 B02764
4/30/2014 23:48:00 40.6880 -73.9608 B02764
Using mutate:
df <- mutate(df, Date.Time = as.POSIXct(Date.Time,
format = "%M/%D%/%Y% %H:%M%S"))
While the type for Date.Time has changed from <chr> to <dttm>, the whole column is just NAs.
Output of Sys.getlocale():
'LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=C;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=C;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C'
Your input formatting codes are not correct. Should be "%m/%d/%Y %H:%M:%S" instead of "%M/%D%/%Y% %H:%M%S". Everything else is fine:
library(dplyr)
df <- read.table(header = TRUE, text = '
Date.Time Lat Lon Base
"4/1/2014 0:11:00" 40.7690 -73.9549 B02512
"4/1/2014 0:17:00" 40.7267 -74.0345 B02512
"4/1/2014 0:21:00" 40.7316 -73.9873 B02512
"4/30/2014 23:31:00" 40.7443 -73.9889 B02764
"4/30/2014 23:32:00" 40.6756 -73.9405 B02764
"4/30/2014 23:48:00" 40.6880 -73.9608 B02764')
df2 <- mutate(df, Date.Time = as.POSIXct(Date.Time,
format = "%m/%d/%Y %H:%M:%S"))
str(df2)
# 'data.frame': 6 obs. of 4 variables:
# $ Date.Time: POSIXct, format: "2014-04-01 00:11:00" "2014-04-01 00:17:00" "2014-04-01 00:21:00" ...
# $ Lat : num 40.8 40.7 40.7 40.7 40.7 ...
# $ Lon : num -74 -74 -74 -74 -73.9 ...
# $ Base : chr "B02512" "B02512" "B02512" "B02764" ...
You can use the anytime package to do this. Here is an example:
date_var = "4/30/2014"
library(anytime)
anytime::anydate(date_var)
results in
> "2014-04-30"
An option with lubridate
library(dplyr)
library(lubridate)
df %>%
mutate(Date.Time = mdy_hms(Date.Time))
Actually, your attempt is quite correct, only the format was wrong.
as.POSIXct("4/1/2014 0:21:00", format = "%M/%D%/%Y% %H:%M%S")
#> [1] NA
If you remove the extra % after day and year, add the missing : between minutes and seconds, and make the codes for month and day lower case (because %M stands for minutes and %D is short for %m/%d/%y), it works:
as.POSIXct("4/1/2014 0:21:00", format = "%m/%d/%Y %H:%M:%S")
#> [1] "2014-04-01 00:21:00 CEST"
Created on 2021-02-14 by the reprex package (v1.0.0)
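Because a wrong format string fails silently with NA, it can help to check the parsed result before using it. The sketch below is one way to do that. The vector x is an illustrative subset of the question's data, and tz = "UTC" is an assumption added so the output does not depend on the machine's time zone.

```r
# Illustrative subset of the Date.Time column
x <- c("4/1/2014 0:11:00", "4/30/2014 23:48:00")

# The broken format from the question: every value becomes NA
broken <- as.POSIXct(x, format = "%M/%D%/%Y% %H:%M%S", tz = "UTC")

# The corrected format
parsed <- as.POSIXct(x, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Fail early if any value could not be parsed
stopifnot(!anyNA(parsed))

shown <- format(parsed, "%Y-%m-%d %H:%M:%S")
print(shown)
```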

How to convert a column of character type of minutes and seconds to time?

I'm trying to convert a column of character to time. The column has observations in only minutes and seconds.
The dataset I try it on is this:
data <- data.frame(
time = c("1:40", "1:55", "3:00", "2:16"),
stringsAsFactors = FALSE
)
data
time
1 1:40
2 1:55
3 3:00
4 2:16
I have checked other questions about converting character to time on here, but haven't found a solution to my problem.
The output I want is this:
time
1 00:01:40
2 00:01:55
3 00:03:00
4 00:02:16
strptime + format will give you the expected result. But realise that the result is a character value.
data$time <- format(strptime(data$time, "%M:%S"), "%H:%M:%S")
data
time
1 00:01:40
2 00:01:55
3 00:03:00
4 00:02:16
Convert to chron times class and if you want a character column format it.
library(chron)
transform(data, time = times(paste0("00:", time)))
transform(data, time = format(times(paste0("00:", time))))
hms is a time class that happens to have the print method you're asking for:
data <- data.frame(
time = c("1:40", "1:55", "3:00", "2:16"),
stringsAsFactors = FALSE
)
data$time <- hms::as_hms(paste0('00:', data$time))
data
#> time
#> 1 00:01:40
#> 2 00:01:55
#> 3 00:03:00
#> 4 00:02:16
str(data)
#> 'data.frame': 4 obs. of 1 variable:
#> $ time: 'hms' num 00:01:40 00:01:55 00:03:00 00:02:16
#> ..- attr(*, "units")= chr "secs"
You can convert to hms in other ways, if you like, e.g. by parsing with as.POSIXct or strptime and then coercing with as_hms.
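If no extra package is wanted, a base R sketch is to split each value on the colon, compute the total number of seconds, and build the HH:MM:SS label with sprintf(). The names time_chr, secs and formatted are illustrative. As with the strptime approach, the label is a character value, while secs stays numeric for arithmetic.

```r
time_chr <- c("1:40", "1:55", "3:00", "2:16")

# Split "M:S" into its two parts and convert to total seconds
parts <- strsplit(time_chr, ":", fixed = TRUE)
secs <- vapply(
  parts,
  function(p) as.integer(p[1]) * 60L + as.integer(p[2]),
  integer(1)
)

# Build a zero-padded HH:MM:SS label from the seconds
formatted <- sprintf(
  "%02d:%02d:%02d",
  secs %/% 3600L,
  (secs %% 3600L) %/% 60L,
  secs %% 60L
)
print(formatted)
```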

Dividing time series by datetime period?

I have some problems splitting a datetime variable into two variables. My time series is an hourly count covering every day and month of a full year (about 360 days).
I would like to generate one variable that ranges from the 1st of each month to the 19th of each month and the second variable captures 20 to the rest of the month:
Format:
datetime hours var1 var2 var3
2011-01-1 00:00:00
2011-01-1 01:00:00
... ...
2011-01-1 23:00:00
2011-01-2 00:00:00
2011-01-2 01:00:00
... ...
2011-01-2 23:00:00
... ...
... ...
2011-01-20 01:00:00
...
2011-01-31 00:00:00
... ...
2011-12-30 00:00:00
2011-12-30 01:00:00
.. ..
Desired Format:
datetime1 datetime2 var1 var2 var2
2011-01-1 00:00:00 2011-01-20 00:00:00
2011-01-1 01:00:00 2011-01-21 01:00:00
.. .. .. ..
2011-01-19 00:00:00 2011-01-30 00:00:00
2011-01-19 01:00:00 2011-01-30 01:00:00
.. .. .. ...
.. .. .. ...
2011-12-19 00:00:00 2011-12-30 00:00:00
2011-12-19 01:00:00 2011-12-30 01:00:00
Originally, I was able to produce the datetime variable by:
rbind or rbind.fill ( plyr) two data frames with datetime1 and datetime2
df3<-rbind(df1,df2)
That is, the original version (two data frames) had these two variables, but I'm not able to separate them now.
I just couldn't formulate the code...
Try this to extract the day of the date:
xx <- as.Date("2014-12-31")
as.POSIXlt(xx)$mday
You can then use the date as a condition to attribute NA to one column and value to the other.
EDIT: Here's the more in-depth version.
#Setting up a replicable example
mydata <- as.data.frame(matrix(rnorm(90), ncol=3))
names(mydata)[1:2] <- paste0("time",1:2)
mydata$time1 <- as.Date(NA)
mydata$time2 <- as.Date(NA)
str(mydata)
#Getting 30 consecutive days:
datestring <- rep(Sys.time(), 30)
for(i in 1:30)
datestring[i]<- Sys.time() + 60*60*24*i
mydata <- cbind(datestring, mydata)
#Doing what I think you're trying to do:
for (i in 1:dim(mydata)[1]){
if(as.POSIXlt(mydata$datestring[i])$mday <=19)
{mydata$time1[i] <- mydata$datestring[i]}
else {mydata$time2[i] <- mydata$datestring[i]}
}
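The same split can be sketched without a loop, since as.POSIXlt(...)$mday works on a whole vector. The example below uses one illustrative month of daily timestamps. The names datetime, first_part, datetime1 and datetime2 are assumptions chosen to mirror the question, and tz = "UTC" is added so the day boundaries do not depend on the local time zone.

```r
# One illustrative month of daily timestamps (January 2011)
datetime <- seq(as.POSIXct("2011-01-01 00:00:00", tz = "UTC"),
                by = "day", length.out = 31)

# TRUE for days 1 to 19 of the month, FALSE for day 20 onwards
first_part <- as.POSIXlt(datetime)$mday <= 19

# Keep each timestamp in exactly one of the two columns, NA in the other
datetime1 <- datetime
datetime1[!first_part] <- NA
datetime2 <- datetime
datetime2[first_part] <- NA

result <- data.frame(datetime1, datetime2)
print(head(result, 3))
```

The logical vector first_part could also be passed to split() if two separate data frames are preferred over two columns.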

Generating date data frame in R

I want to generate a data frame containing dates based on the date that I choose at the beginning as ReportDate:
ReportDate <- as.Date("2014-01-01")
date <- data.frame(matrix(nrow=60, ncol=1))
for(i in 1:60){
date[i,1] = as.Date(ReportDate+i-1, origin="%Y-%m-%d")
}
But it gives me numeric values as output, not date values. Please tell me how I can solve this problem.
You can just add integers to Date class objects directly:
ReportDate <- as.Date("2014-01-01")
DateDf <- data.frame(
date=ReportDate+(0:59))
##
> head(DateDf)
date
1 2014-01-01
2 2014-01-02
3 2014-01-03
4 2014-01-04
5 2014-01-05
6 2014-01-06
> str(DateDf)
'data.frame': 60 obs. of 1 variable:
$ date: Date, format: "2014-01-01" "2014-01-02" "2014-01-03" ...
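An equivalent sketch uses seq() with by = "day", which also extends to other steps such as "week" or "month". The loop in the question most likely returned numbers because assigning single elements into a column that was not created as a Date drops the Date class and leaves only the underlying day counts.

```r
ReportDate <- as.Date("2014-01-01")

# 60 consecutive days starting at ReportDate
DateDf <- data.frame(date = seq(ReportDate, by = "day", length.out = 60))

str(DateDf)
```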
