Subsetting dates in R

I have the following data:
dat <- structure(list(Datetime = structure(c(1261987200, 1261987500,
1261987800, 1261988100, 1261988400), class = c("POSIXct", "POSIXt"
), tzone = ""), Rain = c(0, -999, -999, -999, -999)), row.names = c(NA,
5L), class = "data.frame")
The first column contains the dates (year, month, day, hour). The second column is Rainfall.
The dates are not continuous. Some of the dates with missing Rainfall were already removed.
What is the best way to subset this data by year, month, day, or hour?
For example, I just want to get all data for July (month = 7). What I do is something like this:
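dat$month <- as.numeric(substr(dat$Datetime, 6, 7))  # "07" becomes 7
july <- dat[which(dat$month == 7), ]
or, if it's a year, say 2010:
dat$year <- as.numeric(substr(dat$Datetime, 1, 4))
y2010 <- dat[which(dat$year == 2010), ]
That is, I extract the substrings and convert them to numeric types before comparing.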
Is there an easier way to do this in R? The dates are already stored as POSIXct.
I'd appreciate any help on this.

If you want to convert Datetime to numeric year or month columns, you can use format, as below:
df1 <- transform(
  dat,
  year = as.numeric(format(Datetime, "%Y")),
  month = as.numeric(format(Datetime, "%m"))
)
which gives
             Datetime Rain year month
1 2009-12-28 09:00:00    0 2009    12
2 2009-12-28 09:05:00 -999 2009    12
3 2009-12-28 09:10:00 -999 2009    12
4 2009-12-28 09:15:00 -999 2009    12
5 2009-12-28 09:20:00 -999 2009    12
If you want to subset df1 further by year (for example, year == 2010), then
subset(
df1,
year == 2010
)
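If the extra columns are not needed, the same format() idea can be used directly inside the row index. A minimal sketch, assuming a few made-up rows (the values are illustrative, not from the question) and an explicit UTC time zone so the result does not depend on the local setting:

```r
# Illustrative data (hypothetical values), stored as POSIXct in UTC
dat <- data.frame(
  Datetime = as.POSIXct(c("2009-12-28 08:00", "2010-07-01 09:00",
                          "2010-07-15 10:30", "2011-07-02 08:15"),
                        tz = "UTC"),
  Rain = c(0, 1.2, 0.4, 3.1)
)

# format() returns character, so compare against strings such as "07"
july      <- dat[format(dat$Datetime, "%m") == "07", ]
y2010     <- dat[format(dat$Datetime, "%Y") == "2010", ]
july_2010 <- subset(dat, format(Datetime, "%Y-%m") == "2010-07")

# Hour of day as an integer, taken from the POSIXlt representation
before_9  <- dat[as.POSIXlt(dat$Datetime)$hour < 9, ]
```

Because format() uses the time zone stored on the column, rows near midnight can fall into a different day or month if that zone is not the one you expect.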

You can use the lubridate package and associated month and year functions.
library(tidyverse)
library(lubridate)
df <- structure(list(
Datetime = structure(
c(1261987200, 1261987500,
1261987800, 1261988100, 1261988400),
class = c("POSIXct", "POSIXt"),
tzone = ""
),
Rain = c(0,-999,-999,-999,-999)
),
row.names = c(NA,
5L),
class = "data.frame") %>%
as_tibble()
df %>%
mutate(month = lubridate::month(Datetime),
year = lubridate::year(Datetime))
Output:
# A tibble: 5 x 4
  Datetime             Rain month  year
  <dttm>              <dbl> <dbl> <dbl>
1 2009-12-28 16:00:00     0    12  2009
2 2009-12-28 16:05:00  -999    12  2009
3 2009-12-28 16:10:00  -999    12  2009
4 2009-12-28 16:15:00  -999    12  2009
5 2009-12-28 16:20:00  -999    12  2009
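To go from extracting the components to actually subsetting, the lubridate accessors can be used inside dplyr::filter(). This is only a sketch on made-up rows (the values and the UTC time zone are assumptions for illustration):

```r
library(dplyr)
library(lubridate)

# Illustrative data (hypothetical values), stored as POSIXct in UTC
df <- tibble(
  Datetime = as.POSIXct(c("2009-12-28 08:00", "2010-07-01 09:00",
                          "2010-07-15 10:30", "2011-07-02 08:15"),
                        tz = "UTC"),
  Rain = c(0, 1.2, 0.4, 3.1)
)

# Conditions separated by commas are combined with AND
july_2010 <- df %>% filter(year(Datetime) == 2010, month(Datetime) == 7)

# Rows recorded at 09:00 or later in the day
from_9 <- df %>% filter(hour(Datetime) >= 9)
```

No helper columns are created here; add a mutate() step only if the year or month is needed again later.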

Related

How do I combine year month date time into a single datetime column using DBI in R?

Here's an example of my source data csv, df:
Year : int 2005 2005 2005 2005 2005 2005 2005 2005 2005 2005 ...
Month : int 1 1 1 1 1 1 1 1 1 1 ...
DayofMonth : int 28 29 30 31 2 3 4 5 6 7 ...
Time : int 1605 1605 1610 1605 1900 1900 1900 1900 1900 1900
How can I combine the four columns into a single datetime column in my database, using SQLite rather than dplyr?
I hope the output can be something like: 2005-01-29 09:30:00 so that I can plot graphs.
You could use lubridate::make_datetime:
Year =c(2005, 2005, 2005)
Month = c(1,2,3)
DayofMonth = c( 28, 28, 30)
Time = c(1605, 1605, 1610)
lubridate::make_datetime(Year,Month,DayofMonth,floor(Time/100),Time%%100)
[1] "2005-01-28 16:05:00 UTC" "2005-02-28 16:05:00 UTC" "2005-03-30 16:10:00 UTC"
Base R, using Waldi's sample data (thanks!):
with(df, as.POSIXct(sprintf("%i-%02i-%i %02i:%02i:00",
Year, Month, DayofMonth, Time %/% 100, Time %% 100))
)
# [1] "2005-01-28 16:05:00 EST" "2005-02-28 16:05:00 EST" "2005-03-30 16:10:00 EST"
This is (possibly naïvely) assuming that you never have seconds or fractional minutes or Time values that are not "meaningful" (i.e., more than 59 minutes, more than 23 hours).
Data
df <- structure(list(Year = c(2005, 2005, 2005), Month = c(1, 2, 3), DayofMonth = c(28, 28, 30), Time = c(1605, 1605, 1610)), class = "data.frame", row.names = c(NA, -3L))
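The caveat about "meaningful" Time values can be checked before building the timestamp. A rough base R sketch, assuming Time is always an HHMM number; the two invalid values below are made up for illustration:

```r
# Illustrative data; the last two Time values are deliberately invalid
df <- data.frame(Year = c(2005, 2005, 2005), Month = c(1, 2, 3),
                 DayofMonth = c(28, 28, 30), Time = c(1605, 2460, 1675))

hh <- as.integer(df$Time %/% 100)   # hours
mm <- as.integer(df$Time %% 100)    # minutes
ok <- hh <= 23 & mm <= 59           # TRUE only for valid clock times

stamp <- sprintf("%04d-%02d-%02d %02d:%02d:00",
                 as.integer(df$Year), as.integer(df$Month),
                 as.integer(df$DayofMonth), hh, mm)
stamp[!ok] <- NA                    # leave invalid times missing

df$datetime <- as.POSIXct(stamp, tz = "UTC", format = "%Y-%m-%d %H:%M:%S")
```

Rows that fail the check end up as NA instead of a wrong timestamp, so they are easy to find with is.na(df$datetime).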

extract year and month from character date field R

I have a column in my large data set called Date. How do I extract both the year and month from it? I would like to create a column Month where the month goes from 1-12 and year where the year goes from the first year in my data set to the last year in my data set.
Thanks.
> typeof(data$Date)
[1] "character"
> head(data$Date)
[1] "2/06/2020 11:23" "12/06/2020 7:56" "12/06/2020 7:56" "29/06/2020 16:54" "3/06/2020 15:09" "25/06/2020 17:11"
Using dplyr and lubridate:
library(dplyr)
library(lubridate)
data <- data %>%
mutate(Date = dmy_hm(Date),
month = month(Date),
year = year(Date))
# Date month year
#1 2020-06-02 11:23:00 6 2020
#2 2020-06-12 07:56:00 6 2020
#3 2020-06-12 07:56:00 6 2020
#4 2020-06-29 16:54:00 6 2020
#5 2020-06-03 15:09:00 6 2020
#6 2020-06-25 17:11:00 6 2020
Using base R:
data$Date <- as.POSIXct(data$Date, tz = 'UTC', format = '%d/%m/%Y %H:%M')
data <- transform(data, Month = format(Date, '%m'), Year = format(Date, '%Y'))
data
data <- structure(list(Date = c("2/06/2020 11:23", "12/06/2020 7:56",
"12/06/2020 7:56", "29/06/2020 16:54", "3/06/2020 15:09", "25/06/2020 17:11"
)), class = "data.frame", row.names = c(NA, -6L))
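Note that format() returns text, so the base R version yields "06" rather than 6. If a month running from 1 to 12 is wanted as a number, one possible variant (a sketch using three of the sample strings from the question) wraps the result in as.integer():

```r
# Three of the sample strings from the question
data <- data.frame(Date = c("2/06/2020 11:23", "12/06/2020 7:56",
                            "29/06/2020 16:54"))

parsed <- as.POSIXct(data$Date, tz = "UTC", format = "%d/%m/%Y %H:%M")

# format() gives "06" and "2020" as text; as.integer() makes them 6 and 2020
data$Month <- as.integer(format(parsed, "%m"))
data$Year  <- as.integer(format(parsed, "%Y"))
```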

Convert decimal month and year into Date

I have a 'decimal month' and a year variable:
df <- data.frame(decimal_month = c(4.75, 5, 5.25), year = c(2011, 2011, 2011))
How can I convert these variables to a Date? ("2011-04-22" "2011-05-01" "2011-05-08"). Or at least to day of the year.
You may use some nice functions from the zoo package:
as.yearmon to convert year and floor of the decimal month to class yearmon.
Then use as.Date.yearmon and its frac argument to coerce the year-month to class Date.
library(zoo)
df$date = as.Date(as.yearmon(paste(df$year, floor(df$decimal_month), sep = "-")),
frac = df$decimal_month - floor(df$decimal_month))
# decimal_month year date
# 1 4.75 2011 2011-04-22
# 2 5.00 2011 2011-05-01
# 3 5.25 2011 2011-05-08
If desired, the day of the year is simply format(df$date, "%j"), which returns it as a character string.
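For reference, the same interpolation can be sketched in base R without zoo. This assumes the fractional part means "fraction of the way from the first to the last day of the month", and it truncates to whole days, which is a choice made here for illustration:

```r
df <- data.frame(decimal_month = c(4.75, 5, 5.25), year = c(2011, 2011, 2011))

m    <- floor(df$decimal_month)
frac <- df$decimal_month - m

# First day of each month
first <- as.Date(sprintf("%d-%02d-01", as.integer(df$year), as.integer(m)))

# first + 31 always lands in the next month; its first day minus 1 is the last day
last <- as.Date(format(first + 31, "%Y-%m-01")) - 1

# Move the given fraction of the way from the first to the last day
df$date <- first + floor(frac * as.numeric(last - first))
df$doy  <- as.integer(format(df$date, "%j"))   # day of year as a number
```

For the three sample rows this gives 2011-04-22, 2011-05-01 and 2011-05-08, the dates asked for in the question.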

separating data with respect to month, day, year and hour in R

I have a data frame with two columns: the first is water consumption and the second is date and hour. For example:
Value Time
12.2 1/1/2016 1:00
11.2 1/1/2016 2:00
10.2 1/1/2016 3:00
The data covers 4 years, and I want to create separate columns for month, day, year and hour.
I would appreciate any help.
We can convert to a date-time and then extract the components. We assume the format of the 'Time' column is 'dd/mm/yyyy H:M' (if it is instead 'mm/dd/yyyy H:M', change dmy_hm to mdy_hm):
library(dplyr)
library(lubridate)
df1 %>%
mutate(Time = dmy_hm(Time), month = month(Time),
year = year(Time), hour = hour(Time))
# Value Time month year hour
#1 12.2 2016-01-01 01:00:00 1 2016 1
#2 11.2 2016-01-01 02:00:00 1 2016 2
#3 10.2 2016-01-01 03:00:00 1 2016 3
In base R, we can either use strptime or as.POSIXct and then use either format or extract components
df1$Time <- strptime(df1$Time, "%d/%m/%Y %H:%M")
transform(df1, month = Time$mon+1, year = Time$year + 1900, hour = Time$hour)
# Value Time month year hour
#1 12.2 2016-01-01 01:00:00 1 2016 1
#2 11.2 2016-01-01 02:00:00 1 2016 2
#3 10.2 2016-01-01 03:00:00 1 2016 3
data
df1 <- structure(list(Value = c(12.2, 11.2, 10.2), Time = c("1/1/2016 1:00",
"1/1/2016 2:00", "1/1/2016 3:00")), class = "data.frame", row.names = c(NA,
-3L))
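The question also asks for the day of the month, which the snippets above leave out. A small base R sketch that adds it from the POSIXlt fields, keeping the same dd/mm/yyyy assumption and using UTC only to make the example reproducible:

```r
df1 <- data.frame(Value = c(12.2, 11.2, 10.2),
                  Time = c("1/1/2016 1:00", "1/1/2016 2:00", "1/1/2016 3:00"))

# POSIXlt stores the components as list fields
lt <- as.POSIXlt(df1$Time, tz = "UTC", format = "%d/%m/%Y %H:%M")

df1$year  <- lt$year + 1900L   # years are stored as an offset from 1900
df1$month <- lt$mon + 1L       # months are stored as 0-11
df1$day   <- lt$mday           # day of the month, 1-31
df1$hour  <- lt$hour           # hour of the day, 0-23
```

With lubridate, the equivalent accessor is day(Time).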

R: Order by Date (by Year, by Month) [duplicate]

I have dates in the format mm/yyyy in column 1, and then results in column 2.
month Result
01/2018 96.13636
02/2018 96.40000
3/2018 94.00000
04/2018 97.92857
05/2018 95.75000
11/2017 98.66667
12/2017 97.78947
How can I order by month so that it starts at the earliest month (11/2017) and ends at the latest (05/2018)?
I have tried a few 'orders', but none seem to be ordering by year and then by month
In tidyverse (w/ lubridate added):
library(tidyverse)
library(lubridate)
dfYrMon <-
df1 %>%
mutate(date = parse_date_time(month, "my"),
year = year(date),
month = month(date)
) %>%
arrange(year, month) %>%
select(date, year, month, result)
With data:
df1 <- tibble(month = c("01/2018", "02/2018", "03/2018", "04/2018", "05/2018", "11/2017", "12/2017"),
result = c(96.13636, 96.4, 94, 97.92857, 95.75, 98.66667, 97.78947))
Will get you this 'dataframe':
# A tibble: 7 x 4
date year month result
<dttm> <dbl> <dbl> <dbl>
1 2017-11-01 2017 11 98.66667
2 2017-12-01 2017 12 97.78947
3 2018-01-01 2018 1 96.13636
4 2018-02-01 2018 2 96.40000
5 2018-03-01 2018 3 94.00000
6 2018-04-01 2018 4 97.92857
7 2018-05-01 2018 5 95.75000
Making your data values atomic (year in its own column, month in its own column) generally improves the ease of manipulation.
Or if you want to use base R date manipulations instead of lubridate's:
library(tidyverse)
dfYrMon_base <-
  df1 %>%
  mutate(date = as.Date(paste0("01/", month), "%d/%m/%Y"),
         # date is already a Date, so format() can be applied directly
         year = format(date, "%Y"),
         month = format(date, "%m")
  ) %>%
  arrange(year, month) %>%
  select(date, year, month, result)
dfYrMon_base
Note the datatypes created.
# A tibble: 7 x 4
date year month result
<date> <chr> <chr> <dbl>
1 2017-11-01 2017 11 98.66667
2 2017-12-01 2017 12 97.78947
3 2018-01-01 2018 01 96.13636
4 2018-02-01 2018 02 96.40000
5 2018-03-01 2018 03 94.00000
6 2018-04-01 2018 04 97.92857
7 2018-05-01 2018 05 95.75000
We can convert it to yearmon class and then do the order
library(zoo)
out <- df1[order(as.yearmon(df1$month, "%m/%Y"), df1$Result),]
row.names(out) <- NULL
out
# month Result
#1 11/2017 98.66667
#2 12/2017 97.78947
#3 01/2018 96.13636
#4 02/2018 96.40000
#5 03/2018 94.00000
#6 04/2018 97.92857
#7 05/2018 95.75000
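If zoo is not available, the same ordering can be sketched in base R by building a Date sort key from the first day of each month. The name key is introduced here for illustration only:

```r
df1 <- data.frame(
  month  = c("01/2018", "02/2018", "03/2018", "04/2018",
             "05/2018", "11/2017", "12/2017"),
  Result = c(96.13636, 96.4, 94, 97.92857, 95.75, 98.66667, 97.78947)
)

# Prepend a day so the string parses as a full date, then sort on it
key <- as.Date(paste0("01/", df1$month), format = "%d/%m/%Y")
out <- df1[order(key), ]
row.names(out) <- NULL
```

The month column itself stays in its original mm/yyyy text form; only the row order changes.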
data
df1 <- structure(list(month = c("01/2018", "02/2018", "03/2018", "04/2018",
"05/2018", "11/2017", "12/2017"), Result = c(96.13636, 96.4,
94, 97.92857, 95.75, 98.66667, 97.78947)), .Names = c("month",
"Result"), class = "data.frame",
row.names = c("1", "2", "3",
"4", "5", "6", "7"))
