R - How to subset a table between two specific dates? - r

I have eight years of hourly data, and I would like to subset all the values within a specific year: for example, one data set for 2007, another for 2008, and so on. At the moment I am having trouble with the date format, because when I specify a time period, I get a different date period back.
Here is my table: LValley, and that is what I have tried:
LValley <- read.table("C:/LValley.txt", header=TRUE, dec = ",", sep="\t")
year2007 <- subset(LValley, date > as.Date("01.01.2007 01:00", "%d.%m.%Y %H:%M") & date < as.Date("01.02.2008 01:00", "%d.%m.%Y %H:%M"))
but it returns a different date period, and I would like exactly all the data from 2007.
I have also tried the function from this example, with the same result:
# Subset a dataframe between 2 dates
mydatefunc <- function(x,y){LValley[LValley$date >= x & LValley$date <= y,]}
DATE1 <- as.Date("01.01.2007 01:00", "%d.%m.%Y %H:%M")
DATE2 <- as.Date("01.01.2008 00:00", "%d.%m.%Y %H:%M")
Test2007 <- mydatefunc(DATE1,DATE2)
I would very much appreciate your help,
Kind regards,
Darwin

You need to convert the date column from the file to class Date before comparing it with DATE1 and DATE2; as read from the file it is text, not a date. For example:
LValley <- read.table("LValley.txt", header=TRUE,dec=",", sep="\t", stringsAsFactors=FALSE)
date1 <- as.Date(LValley$date, "%d.%m.%Y %H:%M")
Test2007 <- subset(LValley, date1>=DATE1 & date1 <=DATE2)
dim(Test2007)
#[1] 6249 4
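
Because as.Date() discards the time of day, an alternative for hourly data is to parse the column as POSIXct and select rows by year. The following is only a minimal sketch: the small `hourly` data frame is made up for illustration and stands in for LValley, and the "UTC" time zone is an assumption.

```r
# Hypothetical stand-in for LValley: hourly timestamps stored as text
stamps <- seq(as.POSIXct("2006-12-31 22:00", tz = "UTC"),
              as.POSIXct("2008-01-01 02:00", tz = "UTC"), by = "hour")
hourly <- data.frame(date  = format(stamps, "%d.%m.%Y %H:%M"),
                     value = seq_along(stamps),
                     stringsAsFactors = FALSE)

# Parse the text into date-times; a fixed time zone (assumed here)
# avoids gaps caused by daylight-saving changes
hourly$datetime <- as.POSIXct(hourly$date, format = "%d.%m.%Y %H:%M", tz = "UTC")

# Keep every row whose timestamp falls in 2007
year2007 <- hourly[format(hourly$datetime, "%Y") == "2007", ]

# Or split into one data frame per year in a single step
by_year <- split(hourly, format(hourly$datetime, "%Y"))
names(by_year)
```

Selecting on the formatted year avoids having to spell out the first and last timestamp of each year by hand.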

Related

Changing date formats from MM-YYYY to DD-MM-YYYY by creating random DD

I have a dataset that has a variable date_of_birth (MM-YYYY). I would like to change this format to DD-MM-YYYY by creating random DD for each observation.
df1 <- as.Date(paste0(df,"01/",MMYYYY),format="%d-%m-%Y")
dates <- c("02-1986", "03-1990")
add_random_day <- function(date) {
date <- lubridate::as_date(date, format="%m-%Y")
days_in_month <- lubridate::days_in_month(date)
random_day <- sapply(days_in_month, sample, size = 1)
lubridate::day(date) <- random_day
date
}
add_random_day(dates)
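
A base R variant is also possible. as.Date() cannot parse a month-year string on its own because no day is given, so this sketch first pastes on a placeholder day and then draws a random day within each month. The function name `add_random_day_base` is hypothetical, not part of any package.

```r
# Hypothetical input: month-year strings as in the question
dates <- c("02-1986", "03-1990")

add_random_day_base <- function(x) {
  # A day is required for parsing, so start from the first of each month
  first <- as.Date(paste0("01-", x), format = "%d-%m-%Y")
  # First day of the following month: jump ahead 31 days, then snap back to day 1
  next_first <- as.Date(format(first + 31, "%Y-%m-01"))
  n_days <- as.integer(next_first - first)
  # Draw one day per date; sample.int() always samples from 1..n
  offset <- vapply(n_days, function(n) sample.int(n, 1L), integer(1)) - 1L
  first + offset
}

set.seed(1)  # only to make the draw repeatable
out <- add_random_day_base(dates)
format(out, "%d-%m-%Y")
```

The result is a Date vector; format() is only needed when the DD-MM-YYYY text layout is wanted for display.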

How to convert factors to date format

Id. Int. Date.
234. 10. 10-05-2018
345. 05. 15-05-2018
564. 04. 17-06-2018
df <- read.csv(file)
str(df)
I found that Date is stored as a factor, so I want another column next to the Date column holding the same dates in Date format.
df$dte <- as.Date(df$Date, format= "%d/%b/%Y")
But I got a new column (dte) next to Date in which all the values are <NA>.
Kindly help me.
The format string must match the data: the dates use hyphens and numeric months, so change df$dte <- as.Date(df$Date, format= "%d/%b/%Y") to df$dte <- as.Date(df$Date, format= "%d-%m-%Y").
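
As a small self-contained illustration (the data frame below is made up to mirror the table in the question), the conversion could look like this:

```r
# Hypothetical data mirroring the question; Date is stored as a factor
df <- data.frame(Id   = c(234, 345, 564),
                 Int  = c(10, 5, 4),
                 Date = factor(c("10-05-2018", "15-05-2018", "17-06-2018")))

# Going through as.character() makes it explicit that the labels are parsed;
# the format uses "-" separators and %m for numeric months
df$dte <- as.Date(as.character(df$Date), format = "%d-%m-%Y")

str(df$dte)
```

A format string that does not match the text, such as "%d/%b/%Y" here, yields NA rather than an error.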

Convert Date formats in base R [duplicate]

Given two dates in a data frame that are in this format:
df <- tibble(date = c('25/05/95', '21/09/18'))
df$date <- as.Date(df$date)
How can I convert the dates into this format - date = c('1995-05-25', '2018-09-21') with the year appearing first and in four digit format, and by only using base R?
Here is my attempt, I successfully reversed the order, but still wasn't able to express the year in 4 digit format:
df <- tibble(date_orig = c('25/05/1995', '21/09/2018'))
df$date <- as.Date(df$date_orig)
year_date <- format(df$date, '%d')
month_date <- format(df$date, '%m')
day_date <- format(df$date, '%y')
df$newdate <- as.Date(paste(paste(year_date, month_date, sep = '-'), day_date, sep = '-'))
df$newdate_final <- as.Date(df$newdate, '%Y-%m-%d')
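You need to know which format your dates follow and look up the matching conversion specifications in ?strptime to convert them to Date objects. Because your required output is the standard way to represent dates, no call to format() is needed afterwards. For the four-digit-year strings in date_orig (use %y instead of %Y for two-digit years such as '25/05/95'):
as.Date(df$date_orig, "%d/%m/%Y")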
#[1] "1995-05-25" "2018-09-21"
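
The difference between two-digit and four-digit years can be sketched as follows, using the two sets of strings from the question. The variable names are made up for the example.

```r
# Two-digit years need %y; four-digit years need %Y
short <- c("25/05/95", "21/09/18")
long  <- c("25/05/1995", "21/09/2018")

d_short <- as.Date(short, format = "%d/%m/%y")
d_long  <- as.Date(long,  format = "%d/%m/%Y")

# Date objects print in ISO order (year first, four digits)
d_long

# format() turns a Date back into text in any other layout
format(d_long, "%d.%m.%Y")
```

Note that %y has to guess the century: per ?strptime, values 00 to 68 are read as 20xx and 69 to 99 as 19xx.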

Adding time to day specified dates in R

I want to add hours to day specified dates. And I want the output to be in date format. I wrote the below code:
day<-as.Date(c("20-01-2016", "21-01-2016", "22-01-2016", "23-01-2016"),format="%d-%m-%Y")
hour<-c("12:00:00")
date<-as.Date(paste(day,hour), format="%d-%m-%Y %h:%m:%s")
However, this code produces NAs:
> date
[1] NA NA NA NA
How can I do this in R? I will be very glad for any help. Thanks a lot.
The below code also doesn't work:
day<-as.Date(c("20-01-2016", "21-01-2016", "22-01-2016", "23-01-2016"),format="%d-%m-%Y")
time <- "12:00:00"
x <- paste(day, time)
x1<-as.POSIXct(x, format = "%d-%m-%Y %H:%M:%S")
It still produces NAs:
> x1
[1] NA NA NA NA
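You can do either of these two. Note that paste() turns a Date object into text in the form 2016-01-20, so in the first version the format string must start with %Y-%m-%d; also, as.Date() drops the time of day, so as.POSIXct() is needed to keep it.
# Option 1: dates are Date objects, so paste() yields "2016-01-20 12:00:00"
dates <- as.Date(c("20-01-2016", "21-01-2016", "22-01-2016", "23-01-2016"), format = "%d-%m-%Y")
time <- "12:00:00"
x <- paste(dates, time)
as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S")
# Option 2: dates stay as text in their original day-month-year layout
dates <- c("20-01-2016", "21-01-2016", "22-01-2016", "23-01-2016")
time <- "12:00:00"
x <- paste(dates, time)
as.POSIXct(x, format = "%d-%m-%Y %H:%M:%S")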
I personally find the second version simpler.
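
To see why the attempts in the question returned NA, it can help to look at the text that paste() actually produces. This is a minimal sketch; the tz = "UTC" argument is an assumption added so the result does not depend on the local time zone.

```r
day <- as.Date(c("20-01-2016", "21-01-2016"), format = "%d-%m-%Y")

# paste() converts the Dates to ISO text, not back to the input layout
x <- paste(day, "12:00:00")
x

# The format string has to describe that text, not the original strings
wrong <- as.POSIXct(x, format = "%d-%m-%Y %H:%M:%S", tz = "UTC")
right <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

is.na(wrong)
format(right, "%Y-%m-%d %H:%M:%S")
```

The conversion specifications are also case-sensitive: %H, %M and %S are hours, minutes and seconds, whereas %m is the month number.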

Subset time series by groups based on cutoff date data frame

I have a data frame with time series data for several different groups. I want to apply different start and end cutoff dates to each group within the original data frame.
Here's a sample data frame:
date <- seq(as.POSIXct("2014-07-21 17:00:00", tz= "GMT"), as.POSIXct("2014-09-11 24:00:00", tz= "GMT"), by="hour")
group <- letters[1:4]
datereps <- rep(date, length(group))
attr(datereps, "tzone") <- "GMT"
sitereps <- rep(group, each = length(date))
value <- rnorm(length(datereps))
df <- data.frame(DateTime = datereps, Group = sitereps, Value = value)
and here's the data frame 'cut' of cutoff dates to use:
start <- c("2014-08-01 00:00:00 GMT", "2014-07-26 00:00:00 GMT", "2014-07-21 17:00:00 GMT", "2014-08-03 24:00:00 GMT")
end <- c("2014-09-11 24:00:00 GMT", "2014-09-01 24:00:00 GMT", "2014-09-07 24:00:00 GMT", "2014-09-11 24:00:00 GMT")
cut <- data.frame(Group = group, Start = as.POSIXct(start, tz = "GMT"), End = as.POSIXct(end, tz = "GMT"))
I can do it manually for each group, getting rid of the data I don't want on both ends of the time series using ![(),]:
df2 <- df[!(df$Group == "a" & df$DateTime > "2014-08-01 00:00:00 GMT" & df$DateTime < "2014-09-11 24:00:00 GMT"),]
But, how can I automate this?
Just merge the cuts into the data frame, and then create a new data frame using the new columns, like below. df3 contains the removed records, df4 contains the retained ones.
df2 <- merge(x = df,y = cut,by = "Group")
df3 <- df2[df2$DateTime <= df2$Start | df2$DateTime >= df2$End,]
df4 <- df2[!(df2$DateTime <= df2$Start | df2$DateTime >= df2$End),]
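
The merge-then-filter idea can be tried on a small self-contained example. The data, the group names and the cutoff times below are all made up for illustration, and this sketch keeps rows on the boundaries (>= and <=), whereas the code above excludes them.

```r
# Small hypothetical data set: two groups, hourly values over two days
times <- seq(as.POSIXct("2014-08-01 00:00:00", tz = "GMT"),
             as.POSIXct("2014-08-02 23:00:00", tz = "GMT"), by = "hour")
df <- data.frame(DateTime = rep(times, 2),
                 Group    = rep(c("a", "b"), each = length(times)),
                 Value    = seq_len(2 * length(times)))

# One start/end pair per group
cuts <- data.frame(
  Group = c("a", "b"),
  Start = as.POSIXct(c("2014-08-01 06:00:00", "2014-08-01 12:00:00"), tz = "GMT"),
  End   = as.POSIXct(c("2014-08-02 06:00:00", "2014-08-02 12:00:00"), tz = "GMT"))

# After the merge, every row carries its own group's Start and End
merged <- merge(df, cuts, by = "Group")

# Keep rows on or between the cutoffs, then drop the helper columns
kept <- merged[merged$DateTime >= merged$Start & merged$DateTime <= merged$End,
               c("DateTime", "Group", "Value")]

table(kept$Group)
```

Because the comparison is vectorised over the merged columns, no loop over groups is needed.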
