Trouble filtering by date - r

I am summing some values in a data frame, which works fine until I apply a restriction on dates. I have tried declaring my "z" column of dates as follows:
Table$z <- as.Date(Table$createtime, "%m/%d/%Y",tz = "GMT")
and my End date as:
EndDate <- as.Date('2017-02-01')
These then are used in:
malone <- tbl_df(Table)
malone1 <- malone%>%
group_by(InstallationDate)%>%
summarize(Amount = sum(USAmount), count = n(), filter(z < EndDate))
Unfortunately I am not able to make this work, and I get the following error:
Error in charToDate(x) :
character string is not in a standard unambiguous format
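One possible fix, sketched under assumptions: filter() is a separate dplyr verb rather than an argument to summarize(), so the date restriction probably belongs before group_by(). The sample rows below are made up for illustration, and the column names are taken from the question.

```r
library(dplyr)

# Hypothetical stand-in for the asker's Table (values invented for illustration)
Table <- data.frame(
  createtime       = c("01/15/2017", "01/20/2017", "02/10/2017"),
  InstallationDate = c("A", "A", "B"),
  USAmount         = c(10, 20, 30)
)

# Parse the dates with an explicit format so nothing has to be guessed
Table$z <- as.Date(Table$createtime, format = "%m/%d/%Y")
EndDate <- as.Date("2017-02-01")

# Restrict the rows first, then group and summarize
malone1 <- Table %>%
  filter(z < EndDate) %>%
  group_by(InstallationDate) %>%
  summarize(Amount = sum(USAmount), count = n())
```

With these sample rows, only group "A" survives the filter, so the summary has a single row.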

Related

How to use the as.POSIXlt() and solve the error of character string is not in a standard unambiguous format

I was working on an assignment,
library(tidyverse)
library(quantmod)
library(lubridate)
macro <- c("GDPC1", "CPIAUCSL","DTB3", "DGS10", "DAAA", "DBAA", "UNRATE", "INDPRO", "DCOILWTICO")
rm(macro_factors)
for (i in 1:length(macro)){
getSymbols(macro[i], src = "FRED")
data <- as.data.frame(get(macro[i]))
data$date <- as.POSIXlt.character(rownames(data))
rownames(data) <- NULL
colnames(data)[1] <- "macro_value"
data$quarter <- as.yearqtr(data$date)
data$macro_ticker <- rep(macro[i], dim(data)[1])
data <- data%>%
mutate(date = ymd(date))%>%
group_by(quarter)%>%
top_n(1,date) %>%
filter(date >= "1980-01-01", date <= "2019-12-31") %>%
if(i == 1){macro_factors <- data} else {macro_factors <- rbind(macro_factors, data)}
}
but I got this error:
Error in as.POSIXlt.character(rownames(data)) :
character string is not in a standard unambiguous format
I tried following an online tutorial that uses as.POSIXct() by first converting the data from character to numeric, but it did not work in my case. I checked the data: the values look like "year-month-day" and are of class character, so shouldn't as.POSIXlt() work?
There are several problems:
The POSIXlt class should not be used in data frames. Also, do not use POSIXct for dates, since you can run into needless time zone problems.
To convert an xts object, such as the one produced by getSymbols, to a data frame, use fortify.zoo.
Depending on what you want to do, you might not need to convert from xts to a data frame in the first place. I suggest reading about xts and zoo in the documentation of those packages.
The code below gives a list of data frames, L, and then a long data frame, DF, containing them all.
library(dplyr, exclude = c("filter", "lag"))
library(quantmod) # also brings in xts and zoo
macro <- c("GDPC1", "CPIAUCSL")
getData <- function(symb) symb %>%
  getSymbols(src = "FRED", auto.assign = FALSE) %>%
  aggregate(as.yearqtr, tail, 1) %>%
  window(start = "1980q1", end = "2019q4") %>%
  fortify.zoo
L <- Map(getData, macro)
DF <- bind_rows(L, .id = "id")
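The quarterly aggregation step can be tried offline on a small made-up series. This is only a sketch: the values and dates are invented, and a plain zoo object stands in for the xts object that getSymbols would return.

```r
library(zoo)

# Invented monthly series standing in for downloaded FRED data
dates <- as.Date(c("2019-01-15", "2019-02-15", "2019-03-15",
                   "2019-04-15", "2019-05-15", "2019-06-15"))
z <- zoo(1:6, dates)

# Keep the last observation of each quarter; the index becomes yearqtr
q <- aggregate(z, as.yearqtr, tail, 1)

# Convert to a data frame with the index as a regular column
qdf <- fortify.zoo(q)
```

Here the Date index is mapped to quarters by as.yearqtr, and tail(..., 1) picks the final value within each quarter.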

Date format converting in R

I am trying to convert the values in column date (the data below) into date format yyyy-mm-dd:
I did it using as.Date() and then changed the output into a data frame as follows:
date_new = as.Date(df$Date, origin = '1899-12-30')
better_date = data.frame(Date = unlist(date_new))
I then use the converted data to filter the June rows in each year and do some other tasks, as follows:
me_ff = df |>
filter(month(better_date) == 6) |>
mutate(sorting_date = better_date %m+% months(1)) |>
select(ticker,sorting_date,me_ff = Mkt_cap)
and the error message is:
"Error in `filter()`:
! Problem while computing `..1 = month(better_date) == 6`.
Caused by error in `as.POSIXlt.default()`:
! do not know how to convert 'x' to class “POSIXlt”
Run `rlang::last_error()` to see where the error occurred."
Could you please help me to solve the problem?
Thank you so much for your help!
It looks to me as if you are confusing the data.frame object (the "table") with the date column. Untested code:
df <- transform(df, Date = as.Date(Date, origin = '1899-12-30')) |>
  subset(strftime(Date, "%m") == "06")
should select the June rows.
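A small self-contained check of that idea, assuming the Date column holds Excel-style serial numbers (which the origin '1899-12-30' suggests). The tickers and values are invented for illustration.

```r
# Invented sample data; Date holds Excel-style serial day numbers
df <- data.frame(
  ticker  = c("AAA", "BBB", "CCC"),
  Date    = c(43983, 44013, 44348),
  Mkt_cap = c(100, 200, 300)
)

# Convert the column in place, so Date stays a column of df
df$Date <- as.Date(df$Date, origin = "1899-12-30")

# "%m" is the month as two digits ("%M" would be minutes)
june <- subset(df, format(Date, "%m") == "06")
```

The key point is that month-based filtering is applied to the converted column inside df, not to a separate data frame.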

How to run left join in dplyr transforming the key columns ( using lubridate function) on the fly

I have two databases whose columns I need to combine based on two common Date columns, with the condition that the DAY of those dates is the same.
"2020/01/01 20:30" MUST MATCH "2020/01/01 17:50"
All dates are in POSIXct format.
While I could use some pre-processing with string parsing or the like, I wanted to handle it via lubridate/dplyr like:
DB_New <- left_join(DB_A,DB_B, by=c((date(Date1) = date(Date2)))
Notice that I am using the function "date" from lubridate to match the condition explained above. However, I am getting the error below:
DB_with_rain <- left_join(DB_FEB_2019_join,Chuvas_BH, by=c(date(Saida_Real)= date(DateTime)))
Error: unexpected '=' in "DB_with_rain <- left_join(DB_FEB_2019_join,Chuvas_BH, by=c(date(Saida_Real)="
Within the by argument, we cannot do the conversion; it expects column names as strings. The conversion should be done before the left_join:
library(dplyr)
DB_FEB_2019_join %>%
  mutate(Saida_Real = as.Date(Saida_Real, format = "%Y/%m/%d %H:%M")) %>%
  left_join(Chuvas_BH %>%
              mutate(DateTime = as.Date(DateTime, format = "%Y/%m/%d %H:%M")),
            by = c(Saida_Real = "DateTime"))
With lubridate, the as.Date call can be replaced with ymd_hm, followed by a conversion to Date class with as.Date.
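A minimal sketch of the same pattern on made-up data, assuming both columns are already POSIXct as the question states. The data frames a and b, the helper column day, and the choice of UTC are all assumptions for illustration; note that as.Date on POSIXct takes the calendar day in the time zone passed as tz.

```r
library(dplyr)

# Invented sample tables with POSIXct key columns
a <- data.frame(
  Date1 = as.POSIXct(c("2020-01-01 20:30", "2020-01-02 09:00"), tz = "UTC"),
  x     = 1:2
)
b <- data.frame(
  Date2 = as.POSIXct(c("2020-01-01 17:50", "2020-01-03 12:00"), tz = "UTC"),
  rain  = c(5.2, 0)
)

# Derive a day-level key on both sides, then join on that key
joined <- a %>%
  mutate(day = as.Date(Date1, tz = "UTC")) %>%
  left_join(b %>% mutate(day = as.Date(Date2, tz = "UTC")), by = "day")
```

Keeping the original date-time columns and joining on a separate day column avoids losing the time of day, which the in-place conversion would discard.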

Strptime fails when working with a dataframe

Strptime seems to be missing something in this scenario:
aDateInPOSIXct <- strptime("2018-12-31", format = "%Y-%m-%d")
someText <- "asdf"
df <- data.frame(aDateInPOSIXct, someText, stringsAsFactors = FALSE)
bDateInPOSIXct <- strptime("2019-01-01", format = "%Y-%m-%d")
df[1,1] <- bDateInPOSIXct
Assignment of bDate to the dataframe fails with:
Error in as.POSIXct.numeric(value) : 'origin' must be supplied
And a warning:
provided 11 variables to replace 1 variables
I want to use both POSIXct dates and POSIXct date-times to compare this and that. It's way less work than manipulating character strings -- and POSIX takes care of the time zone issues. Unfortunately, I'm missing something.
strptime returns a POSIXlt object, not POSIXct, so you only need to convert the results of your strptime calls to POSIXct explicitly:
aDateInPOSIXct <- as.POSIXct(strptime("2018-12-31", format = "%Y-%m-%d"))
someText <- "asdf"
df <- data.frame(aDateInPOSIXct, someText, stringsAsFactors = FALSE)
bDateInPOSIXct <- as.POSIXct(strptime("2019-01-01", format = "%Y-%m-%d"))
df[1,1] <- bDateInPOSIXct
Check the R documentation which says:
Character input is first converted to class "POSIXlt" by strptime: numeric input is first converted to "POSIXct".
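The class difference can be checked directly. In this sketch the variable names lt and ct, the column name when, and the explicit tz = "UTC" are assumptions added for illustration.

```r
# strptime returns POSIXlt, a list-based representation of a date-time
lt <- strptime("2018-12-31", format = "%Y-%m-%d", tz = "UTC")
class(lt)  # "POSIXlt" "POSIXt"

# as.POSIXct gives a single number (seconds since the epoch) with a class
ct <- as.POSIXct(lt)
class(ct)  # "POSIXct" "POSIXt"

# A POSIXct column accepts a POSIXct replacement value without complaint
df <- data.frame(when = ct, someText = "asdf", stringsAsFactors = FALSE)
df[1, "when"] <- as.POSIXct(strptime("2019-01-01", format = "%Y-%m-%d", tz = "UTC"))
```

Because a POSIXlt value is a list of several components, assigning it into a single data frame cell is what triggers the "provided 11 variables to replace 1 variables" warning in the question.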

I want to run code on data frame up to a certain date (column 2)

I am trying to run code on a data frame up to a certain date. I have individual game statistics; the second column is Date, in chronological order. I thought this was how to do it; however, I get an error:
Error in `[.data.frame`(dfmess, dfmess$Date <= Standingdate) :
undefined columns selected
Here is my code:
read.csv("http://www.football-data.co.uk/mmz4281/1516/E0.csv")
dfmess <- read.csv("http://www.football-data.co.uk/mmz4281/1516/E0.csv", stringsAsFactors = FALSE)
Standingdate <- as.Date("09/14/15", format = "%m/%d/%y")
dfmess[dfmess$Date <= Standingdate] -> dfmess
You probably want to convert dfmess$Date to Date class before comparing. In addition, per @Roland's comment, you need an additional comma so that the condition selects rows rather than columns:
dfmess <- read.csv("http://www.football-data.co.uk/mmz4281/1516/E0.csv", stringsAsFactors = FALSE)
dfmess$Date <- as.Date(dfmess$Date, "%m/%d/%y")
Standingdate <- as.Date("09/14/15", format = "%m/%d/%y")
dfmess[dfmess$Date <= Standingdate, ]
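The same idea can be checked offline on a few made-up rows instead of the downloaded file. The team names and dates are invented, and the "%m/%d/%y" format is only an assumption carried over from the code above; it should be checked against how the real file writes its dates.

```r
# Invented rows standing in for the downloaded CSV
dfmess <- data.frame(
  Date     = c("09/12/15", "09/14/15", "09/19/15"),
  HomeTeam = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# Convert the character column to Date before comparing
dfmess$Date  <- as.Date(dfmess$Date, format = "%m/%d/%y")
Standingdate <- as.Date("09/14/15", format = "%m/%d/%y")

# The trailing comma makes the logical vector select rows, not columns
upto <- dfmess[dfmess$Date <= Standingdate, ]
```

Without the comma, the logical vector is interpreted as a column selector, which is what produces "undefined columns selected".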
