I would like to convert my Date column into a real date format, but I cannot convert it with as.Date. Could you please help me? I tried the following, but neither attempt works:
1)as.POSIXct(as.character(df1$Date, format="%Y%m%d"))
2)as.Date(as.character(Date),"%Y/%m%/d").
> head(df1)
Date sim obs
2 43091.25 313.62295499999999 314.39
3 43093.25 313.60034200000001 314.43
4 43094.25 313.608948 314.31
5 43095.25 313.56323200000003 314.24
6 43096.25 313.52330000000001 314.2
7 43097.25 313.47250000000003 314.29000000000002
> str(df1)
'data.frame': 700 obs. of 3 variables:
$ Date: chr "43091.25" "43093.25" "43094.25" "43095.25" ...
$ sim : chr "313.62295499999999" "313.60034200000001" "313.608948" "313.56323200000003" ...
$ obs : chr "314.39" "314.43" "314.31" "314.24" ...
There is no need to use external packages, just specify the origin Excel uses:
Date <- c("43091.25", "43093.25", "43094.25", "43095.25", "43096.25", "43097.25")
as.Date(as.numeric(Date), origin = "1899-12-30")
# [1] "2017-12-22" "2017-12-24" "2017-12-25" "2017-12-26" "2017-12-27" "2017-12-28"
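The serial numbers in the question carry a fractional part (.25 of a day, i.e. 06:00), which as.Date() drops. A minimal base R sketch that keeps the time of day by converting days to seconds is shown here; it assumes the values are Excel serials from the 1900 date system and that UTC is an acceptable time zone:

```r
# Excel serial day numbers stored as text, as in the question
serial <- c("43091.25", "43093.25", "43094.25")

# Days -> seconds counted from the origin used above (1899-12-30).
# tz = "UTC" is an assumption; use the zone the data was recorded in.
stamp <- as.POSIXct(as.numeric(serial) * 86400,
                    origin = "1899-12-30", tz = "UTC")

format(stamp, "%Y-%m-%d %H:%M")
# [1] "2017-12-22 06:00" "2017-12-24 06:00" "2017-12-25 06:00"
```

The same call can be applied to a whole column, e.g. df1$Date, once the column has been passed through as.numeric().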
Dates with a 43ddd.dd pattern usually come from Excel.
You can use openxlsx:
openxlsx::convertToDateTime(as.numeric("43091.25"))
[1] "2017-12-22 06:00:00 CET"
Waldi's answer is great and openxlsx is great. As an alternative for anyone coming across this in the future, you can also use janitor:
library(dplyr)  # provides tribble(), mutate() and %>%
df <- tribble(~Date, ~sim, ~obs,
43091.25, 313.6222956, 314.39,
43093.25, 313.60034200000001, 314.43,
43094.25, 313.608948, 314.31)
df %>%
mutate(Date = janitor::excel_numeric_to_date(Date))
Produces:
Date sim obs
<date> <dbl> <dbl>
1 2017-12-22 314. 314.
2 2017-12-24 314. 314.
3 2017-12-25 314. 314.
I have imported a CSV containing dates in the column "Activity_Date_Minute". A typical value is "04/12/2016 01:12:00". When I read the .csv into a data frame and extract only the date, the new column shows 4-12-20. How can I get the date as mm-dd-yyyy in a separate column?
I tried the code below and was expecting a column with dates such as 04/12/2016 (mm/dd/yyyy).
#Installing packages
install.packages("tidyverse")
library(tidyverse)
install.packages('ggplot2')
library(ggplot2)
install.packages("dplyr")
library(dplyr)
install.packages("lubridate")
library(lubridate)
##Reading minute-wise METs into "minutewiseMET_Records" and summarizing MET per day for all the IDs
minutewiseMET_Records <- read.csv("minuteMETsNarrow_merged.csv")
str(minutewiseMET_Records)
## converting column ID to character,Activity_Date_Minute to date
minutewiseMET_Records$Id <- as.character(minutewiseMET_Records$Id)
minutewiseMET_Records$Date <- as.Date(minutewiseMET_Records$Activity_Date_Minute)
str(minutewiseMET_Records)
The Console is as follows:
> minutewiseMET_Records <- read.csv("minuteMETsNarrow_merged.csv")
> str(minutewiseMET_Records)
'data.frame': 1048575 obs. of 3 variables:
$ Id : num 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ Activity_Date_Minute: chr "04/12/2016 00:00" "04/12/2016 00:01" "04/12/2016 00:02" "04/12/2016 00:03" ...
$ METs : int 10 10 10 10 10 12 12 12 12 12 ...
> ## converting column ID to character,Activity_Date_Minute to date
> minutewiseMET_Records$Id <- as.character(minutewiseMET_Records$Id)
> minutewiseMET_Records$Date <- as.Date(minutewiseMET_Records$Activity_Date_Minute)
> str(minutewiseMET_Records)
'data.frame': 1048575 obs. of 4 variables:
$ Id : chr "1503960366" "1503960366" "1503960366" "1503960366" ...
$ Activity_Date_Minute: chr "04/12/2016 00:00" "04/12/2016 00:01" "04/12/2016 00:02" "04/12/2016 00:03" ...
$ METs : int 10 10 10 10 10 12 12 12 12 12 ...
$ Date : Date, format: "4-12-20" "4-12-20" ...
>
I think this will work for you:
minutewiseMET_Records$Date <- format(as.Date(minutewiseMET_Records$Activity_Date_Minute, format = "%m/%d/%Y"), "%m/%d/%Y")
First you have to tell R the format of your initial data (here month/day/year, as stated in the question). Then you tell it which format you want for the output. Note that format() returns a character column, not a Date.
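A small worked example may make the two steps concrete. Without a format argument, as.Date() tries "%Y-%m-%d" and then "%Y/%m/%d", so "04/12/2016" is read as year 4, month 12, day 20, which matches the "4-12-20" shown in the question. The sketch below uses base R only; the third sample value is made up for illustration:

```r
x <- c("04/12/2016 00:00", "04/12/2016 00:01", "04/13/2016 23:59")

# Step 1: describe the INPUT layout (month/day/year).
# Trailing characters such as the time are ignored by as.Date().
d <- as.Date(x, format = "%m/%d/%Y")
d
# [1] "2016-04-12" "2016-04-12" "2016-04-13"

# Step 2: choose the OUTPUT layout; the result is character, not Date.
format(d, "%m/%d/%Y")
# [1] "04/12/2016" "04/12/2016" "04/13/2016"
```

Keeping the Date column (d) and formatting it only for display is usually more convenient, because sorting and date arithmetic need the Date class.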
Activity_Date_Minute isn’t a datetime in your initial data, it’s a character. So you’ll have to first convert it to a datetime (e.g., using lubridate::mdy_hm()), then use as.Date().
library(dplyr)
library(lubridate)
minutewiseMET_Records %>%
mutate(
Activity_Date_Minute = mdy_hm(Activity_Date_Minute),
Activity_Date = as.Date(Activity_Date_Minute)
)
# A tibble: 4 × 2
Activity_Date_Minute Activity_Date
<dttm> <date>
1 2016-04-12 00:00:00 2016-04-12
2 2016-04-12 00:01:00 2016-04-12
3 2016-04-12 00:02:00 2016-04-12
4 2016-04-12 00:03:00 2016-04-12
I am trying to read dates from different Excel files, and each file stores the dates in a different format (character or date). As a result, the date column is read either as a character string such as "28/02/2020" or as the numeric value Excel uses for dates, such as "452344" (the number of days since 1900).
files1 = list.files(pattern="*.xlsx")
df = lapply(files1, read_excel,col_types = "text")
df = do.call(rbind, df)
How can I make R read the character type "28/02/2020" instead of the numeric type "452344"?
For multiple date formats in one column I suggest using lubridate::parse_date_time() (or any other date converter that turns values it cannot parse into NA instead of stopping with an error).
I assume your df should look something like this:
# A tibble: 6 x 2
id date
<chr> <chr>
1 1 43889
2 2 43889
3 3 43889
4 1 28/02/2020
5 2 28/02/2020
6 3 28/02/2020
Then you should use this code:
library(lubridate)
df <- as.data.frame(df)
df$date2 <- parse_date_time(x = df$date, orders = "d m y") #converts rows like "28/02/2020" to date
df[is.na(df$date2),"date2"] <- as.Date(as.numeric(df[is.na(df$date2),"date"]), origin = "1899-12-30") #converts rows like "43889"
R output:
id date date2
1 1 43889 2020-02-28
2 2 43889 2020-02-28
3 3 43889 2020-02-28
4 1 28/02/2020 2020-02-28
5 2 28/02/2020 2020-02-28
6 3 28/02/2020 2020-02-28
str(df)
'data.frame': 6 obs. of 3 variables:
$ id : chr "1" "2" "3" "1" ...
$ date : chr "43889" "43889" "43889" "28/02/2020" ...
$ date2: POSIXct, format: "2020-02-28" "2020-02-28" "2020-02-28" "2020-02-28" ...
I know it is not the nicest solution, but it should work for you as well.
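The same two-pass idea can be sketched in base R without lubridate. The helper name parse_mixed_dates is hypothetical. It assumes the text dates are always dd/mm/yyyy and that the numbers are Excel serials from the 1900 date system:

```r
# Hypothetical helper: parse a character vector that mixes "dd/mm/yyyy"
# strings with Excel serial numbers (1900 date system assumed).
parse_mixed_dates <- function(x) {
  # Pass 1: text dates; anything that is not dd/mm/yyyy becomes NA
  out <- as.Date(x, format = "%d/%m/%Y")
  # Pass 2: remaining purely numeric strings are treated as Excel serials
  is_serial <- is.na(out) & grepl("^[0-9]+(\\.[0-9]+)?$", x)
  out[is_serial] <- as.Date(as.numeric(x[is_serial]), origin = "1899-12-30")
  out
}

parse_mixed_dates(c("43889", "28/02/2020", "not a date"))
# [1] "2020-02-28" "2020-02-28" NA
```

Unlike the lubridate version, this returns a Date column rather than POSIXct, and values that match neither pattern stay NA so they can be inspected afterwards.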
I'm trying to convert a column of character to time. The column has observations in only minutes and seconds.
The dataset I try it on is this:
data <- data.frame(
time = c("1:40", "1:55", "3:00", "2:16"),
stringsAsFactors = FALSE
)
data
time
1 1:40
2 1:55
3 3:00
4 2:16
I have checked other questions about converting character to time on here, but haven't found a solution to my problem.
The output I want is this:
time
1 00:01:40
2 00:01:55
3 00:03:00
4 00:02:16
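strptime + format will give you the expected result, but be aware that the result is a character value. (Older R versions also accept the output format as an argument to as.character(); recent versions may no longer honor it, so format() is the safer call.)
data$time <- format(strptime(data$time, "%M:%S"), "%H:%M:%S")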
data
time
1 00:01:40
2 00:01:55
3 00:03:00
4 00:02:16
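Because the formatted column is character, it cannot be used for arithmetic. If durations need to be summed or compared, one option is to keep the total seconds as a number and build the display string only when needed. This is a base R sketch, assuming every value has the form minutes:seconds:

```r
time_chr <- c("1:40", "1:55", "3:00", "2:16")

# Split "m:ss" into minutes and seconds, then total the seconds
parts <- strsplit(time_chr, ":", fixed = TRUE)
secs <- vapply(parts,
               function(p) as.integer(p[1]) * 60L + as.integer(p[2]),
               integer(1))
secs
# [1] 100 115 180 136

# Rebuild the HH:MM:SS display from the numeric values
sprintf("%02d:%02d:%02d", secs %/% 3600L, (secs %% 3600L) %/% 60L, secs %% 60L)
# [1] "00:01:40" "00:01:55" "00:03:00" "00:02:16"
```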
Convert to the chron times class and, if you want a character column, format it.
library(chron)
transform(data, time = times(paste0("00:", time)))
transform(data, time = format(times(paste0("00:", time))))
hms is a time class that happens to have the print method you're asking for:
data <- data.frame(
time = c("1:40", "1:55", "3:00", "2:16"),
stringsAsFactors = FALSE
)
data$time <- hms::as_hms(paste0('00:', data$time))  # as_hms() supersedes the deprecated as.hms()
data
#> time
#> 1 00:01:40
#> 2 00:01:55
#> 3 00:03:00
#> 4 00:02:16
str(data)
#> 'data.frame': 4 obs. of 1 variable:
#> $ time: 'hms' num 00:01:40 00:01:55 00:03:00 00:02:16
#> ..- attr(*, "units")= chr "secs"
You can convert to hms in other ways, if you like, e.g. by parsing with as.POSIXct or strptime and then coercing with as_hms.
I have a data frame "RH" with hourly data, and I want to convert it to daily maximum and minimum data. The code from the question "Aggregating hourly data into daily aggregates" was very useful:
RH$Date <- strptime(RH$Date,format="%y/%m/%d")
RH$day <- trunc(RH$Date,"day")
require(plyr)
x <- ddply(RH,.(Date),
summarize,
aveRH=mean(RH),
maxRH=max(RH),
minRH=min(RH)
)
However, the first 5 years of my data are 3-hourly rather than hourly, so I get no results for those years. Any suggestions? Thank you in advance.
'data.frame': 201600 obs. of 3 variables:
$ Date: chr "1985/01/01" "1985/01/01" "1985/01/01" "1985/01/01" ...
$ Hour: int 1 2 3 4 5 6 7 8 9 10 ...
$ RH : int NA NA 93 NA NA NA NA NA 79 NA ...
The link you provided is an old one. The code there is still perfectly good and would work, but here's a more modern version using dplyr and lubridate:
df <- read.table(text='date_time value
"01/01/2000 01:00" 30
"01/01/2000 02:00" 31
"01/01/2000 03:00" 33
"12/31/2000 23:00" 25',header=TRUE,stringsAsFactors=FALSE)
library(dplyr);library(lubridate)
df %>%
mutate(date_time=as.POSIXct(date_time,format="%m/%d/%Y %H:%M")) %>%
group_by(date(date_time)) %>%
summarise(mean=mean(value,na.rm=TRUE),max=max(value,na.rm=TRUE),
min=min(value,na.rm=TRUE))
`date(date_time)` mean max min
<date> <dbl> <dbl> <dbl>
1 2000-01-01 31.33333 33 30
2 2000-12-31 25.00000 25 25
EDIT
Since there's already a date column, this should work:
RH %>%
group_by(Date) %>%
summarise(mean=mean(RH,na.rm=TRUE),max=max(RH,na.rm=TRUE),
min=min(RH,na.rm=TRUE))
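The reason na.rm=TRUE matters here: in the 3-hourly years most hours are NA (as the str() output in the question shows), and mean(), max() and min() return NA as soon as one input is missing. The base R sketch below shows the effect on made-up values in the same layout as RH. The column names meanRH, maxRH and minRH are illustrative:

```r
# Made-up sample: day 1 has a reading only every third hour
RH <- data.frame(
  Date = c("1985/01/01", "1985/01/01", "1985/01/01", "1985/01/02", "1985/01/02"),
  Hour = c(1L, 2L, 3L, 1L, 2L),
  RH   = c(NA, NA, 93, 79, 85)
)
RH$Date <- as.Date(RH$Date, format = "%Y/%m/%d")  # four-digit year needs %Y

day1 <- RH$RH[RH$Date == as.Date("1985-01-01")]
max(day1)                # NA: a single missing value hides the result
max(day1, na.rm = TRUE)  # 93

# Daily summaries with missing hours ignored
by_day <- split(RH$RH, RH$Date)
daily <- data.frame(
  Date   = as.Date(names(by_day)),
  meanRH = vapply(by_day, mean, numeric(1), na.rm = TRUE),
  maxRH  = vapply(by_day, max,  numeric(1), na.rm = TRUE),
  minRH  = vapply(by_day, min,  numeric(1), na.rm = TRUE),
  row.names = NULL
)
daily
# day 1: mean 93, max 93, min 93; day 2: mean 82, max 85, min 79
```

A day whose readings are all NA would still need separate handling, since max() and min() with na.rm = TRUE return -Inf and Inf (with a warning) when nothing is left.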
I know this has been asked several times, but I could not find the right way around my problem. I have a very simple CSV file that I load, which looks like this:
27.07.2015,100
28.07.2015,100.1504
29.07.2015,100.1957
30.07.2015,100.5044
31.07.2015,100.7661
03.08.2015,100.9308
04.08.2015,100.8114
05.08.2015,100.6927
06.08.2015,100.7501
07.08.2015,100.7194
10.08.2015,100.8197
11.08.2015,100.8133
Now I need to convert my data.frame into xts so I can use the PerformanceAnalytics package. My data.frame has the structure:
> str(mpey)
'data.frame': 243 obs. of 2 variables:
$ V1: Factor w/ 243 levels "01.01.2016","01.02.2016",..: 210 218 228 234 241 21 30 38 45 52 ...
- attr(*, "names")= chr "5" "6" "7" "8" ...
$ V2: Factor w/ 242 levels "100","100.0062",..: 1 4 5 10 16 20 17 13 15 14 ...
- attr(*, "names")= chr "5" "6" "7" "8" ...
I tried different things with the as.xts function but could not make it work.
Could you please help me get over this?
Here's a solution using the tidyquant package, which contains as_xts() for coercing data frames to xts objects and as_tibble() for coercing time series objects such as xts to tibbles ("tidy" data frames).
Recreate your data
> data_df
# A tibble: 12 × 2
date value
<fctr> <fctr>
1 27.07.2015 100
2 28.07.2015 100.1504
3 29.07.2015 100.1957
4 30.07.2015 100.5044
5 31.07.2015 100.7661
6 03.08.2015 100.9308
7 04.08.2015 100.8114
8 05.08.2015 100.6927
9 06.08.2015 100.7501
10 07.08.2015 100.7194
11 10.08.2015 100.8197
12 11.08.2015 100.8133
First, we need to reformat your data frame. The dates and values are both stored as factors and they need to be in a date and double class, respectively. We'll load tidyquant and reformat the data frame. Note that tidyquant loads the tidyverse and financial packages so you don't need to load anything else. The date can be converted with lubridate::dmy which converts characters in a day-month-year format to date. The value needs to go from factor to character then from character to double, and this is done by nesting as.numeric and as.character.
> library(tidyquant)
> data_tib <- data_df %>%
mutate(date = dmy(date),
value = as.numeric(as.character(value)))
> data_tib
# A tibble: 12 × 2
date value
<date> <dbl>
1 2015-07-27 100.0000
2 2015-07-28 100.1504
3 2015-07-29 100.1957
4 2015-07-30 100.5044
5 2015-07-31 100.7661
6 2015-08-03 100.9308
7 2015-08-04 100.8114
8 2015-08-05 100.6927
9 2015-08-06 100.7501
10 2015-08-07 100.7194
11 2015-08-10 100.8197
12 2015-08-11 100.8133
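To see why the nested as.numeric(as.character()) call is needed, here is a short base R illustration on made-up factor values. Calling as.numeric() directly on a factor returns the internal level codes rather than the numbers the labels spell out:

```r
value <- factor(c("100.1504", "100", "100.1957"))

as.numeric(value)                # 2 1 3: positions in the sorted levels
as.numeric(as.character(value))  # 100.1504 100.0000 100.1957

# Base R counterpart of dmy() for day.month.year text stored as a factor
as.Date(as.character(factor("27.07.2015")), format = "%d.%m.%Y")
# [1] "2015-07-27"
```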
Now, we can coerce to xts using the tidyquant::as_xts() function. Just specify date_col = date.
> data_xts <- data_tib %>%
as_xts(date_col = date)
> data_xts
value
2015-07-27 100.0000
2015-07-28 100.1504
2015-07-29 100.1957
2015-07-30 100.5044
2015-07-31 100.7661
2015-08-03 100.9308
2015-08-04 100.8114
2015-08-05 100.6927
2015-08-06 100.7501
2015-08-07 100.7194
2015-08-10 100.8197
2015-08-11 100.8133
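As an alternative sketch that avoids factors from the start, the file can be read with stringsAsFactors = FALSE and passed to the xts() constructor directly. This assumes a header-less, comma-separated file like the one in the question. The first three rows are inlined here so the snippet is self-contained, and the xts step runs only if that package is installed:

```r
# First rows of the CSV from the question, read without a header
csv_text <- "27.07.2015,100
28.07.2015,100.1504
29.07.2015,100.1957"

# stringsAsFactors = FALSE keeps V1 as text; V2 is read as numeric
mpey <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)
mpey$V1 <- as.Date(mpey$V1, format = "%d.%m.%Y")
str(mpey)

# xts() takes the data and a time index (order.by)
if (requireNamespace("xts", quietly = TRUE)) {
  mpey_xts <- xts::xts(mpey$V2, order.by = mpey$V1)
  print(mpey_xts)
}
```

With a real file, read.csv("yourfile.csv", header = FALSE, stringsAsFactors = FALSE) would replace the text = argument; the file name here is a placeholder.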