How do I rename a date field in SQLDF without changing the format?
See my example below where my renamed date field "dt" converts the date to a number. How do I avoid this, or convert it back to a date?
#Question for Stack Exchange
df <- data.frame (date = c("2014-12-01","2014-12-02","2014-12-03"),
acct = c(1,2,3))
df$date = as.Date(df$date)
library("sqldf")
sqldf('
select
date as dt,
date,
acct
from df ')
dt date acct
1 16405 2014-12-01 1
2 16406 2014-12-02 2
3 16407 2014-12-03 3
Specify the method as follows:
sqldf('select date as dt__Date,
date as date__Date,
acct
from df',
method = "name__class")
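For the "convert it back" half of the question, here is a minimal base R sketch. It assumes the numbers sqldf returned are day counts since 1970-01-01, which is how R stores Date values internally. The variable names are illustrative and not from the original post.

```r
# Day counts as they came back from sqldf in the "dt" column
dt_num <- c(16405, 16406, 16407)

# Re-attach the Date class by stating the origin R uses for Dates
dt <- as.Date(dt_num, origin = "1970-01-01")

print(dt)
class(dt)
```

This only repairs the result after the fact; the method = "name__class" approach above avoids the conversion in the first place.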
I want to change the column of "dob" to date format.
I used the below code but did not see changes in my data.
Date = as.Date(ped$dob)
Can you guide me through this?
ID sex dob yob
1: 126000 M 20220523 2022
2: 375000 M 20220523 2022
Try this:
Note the capital Y in the format:
as.Date(as.character(ped$dob), '%Y%m%d')
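The attempt in the question stored the result in a separate variable called Date, so ped itself never changed. Below is a small sketch of assigning the converted values back to the column. The ped data frame is a hypothetical stand-in built from the two rows shown in the question.

```r
# Hypothetical stand-in for the data shown in the question
ped <- data.frame(ID  = c(126000, 375000),
                  sex = c("M", "M"),
                  dob = c(20220523L, 20220523L),
                  yob = c(2022L, 2022L))

# Convert via character so the digits are read as year, month, day,
# and assign the result back into the data frame
ped$dob <- as.Date(as.character(ped$dob), format = "%Y%m%d")

str(ped$dob)
```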
I have a dataframe with the column name perioden. This column contains the date but it is written in this format: 2010JJ00, 2011JJ00, 2012JJ00, 2013JJ00 etc..
This column is also a character when I look at the structure. I've tried multiple solutions but am still stuck. My question is: how can I convert this column to a date, and how do I remove the JJ00 part so that only the year is shown?
You can try this approach: use gsub() to remove the unwanted text (as suggested by @AllanCameron), then use paste0() to add a month and day, and as.Date() to convert the result to a date:
#Data
df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'),stringsAsFactors = FALSE)
#Remove string
df$Date <- gsub('JJ00','',df$Date)
#Format to date, you will need a day and month
df$Date2 <- as.Date(paste0(df$Date,'-01-01'))
Output:
Date Date2
1 2010 2010-01-01
2 2011 2011-01-01
3 2012 2012-01-01
4 2013 2013-01-01
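If only the year is wanted, as the question asks, a Date is not strictly needed. Here is a small base R sketch, assuming every value ends in the literal suffix JJ00. The vector name perioden mirrors the question's column; the other names are illustrative.

```r
perioden <- c("2010JJ00", "2011JJ00", "2012JJ00", "2013JJ00")

# Strip the trailing "JJ00" and keep the year as an integer
year <- as.integer(sub("JJ00$", "", perioden))

# If a Date is needed anyway, anchor each year to 1 January
year_start <- as.Date(paste0(year, "-01-01"))

# A Date can always be shown as a year only
format(year_start, "%Y")
```

Keeping the year as an integer is often simpler for grouping or plotting by year, while the Date version fits functions that expect a real date.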
We can use ymd with the truncated option:
library(lubridate)
library(stringr)
ymd(str_remove(df$Date, 'JJ\\d+'), truncated = 2)
#[1] "2010-01-01" "2011-01-01" "2012-01-01" "2013-01-01"
data
df <- data.frame(Date=c('2010JJ00', '2011JJ00', '2012JJ00', '2013JJ00'), stringsAsFactors = FALSE)
I have looked at different options from previous answers, but none has given me the correct output.
I would like to separate timestamp into date and time using R
sorted_transactions_table$TRANSACTION_DATE <- as.Date(sorted_transactions_table$TRANSACTION_TIME)
I have tried this but I get an error:
Error in charToDate(x) : character string is not in a standard
unambiguous format
Timestamp from my dataset is in the format:
01-OCT-18 12.01.23.000000 AM
Convert it into a standard date-time first and then use format. Because the timestamp carries an AM/PM marker, the hour is parsed with %I (12-hour clock) rather than %H, which would ignore the marker:
df$TRANSACTION_DATE <- as.POSIXct(df$TRANSACTION_DATE,
format = "%d-%b-%y %I.%M.%OS %p", tz = "UTC")
#tz = "UTC" keeps as.Date() below from shifting times just after midnight to the previous day
transform(df, Date = as.Date(TRANSACTION_DATE),
#Also Date = format(TRANSACTION_DATE, "%Y-%m-%d") would work
time = format(TRANSACTION_DATE, "%T"))
# col1 TRANSACTION_DATE Date time
#1 1 2018-10-01 00:01:23 2018-10-01 00:01:23
#2 2 2018-10-01 00:02:23 2018-10-01 00:02:23
#3 3 2018-10-01 00:03:23 2018-10-01 00:03:23
You could also do this in a dplyr chain:
library(dplyr)
df %>%
mutate(TRANSACTION_DATE = as.POSIXct(TRANSACTION_DATE,
format = "%d-%b-%y %I.%M.%OS %p", tz = "UTC"),
Date = as.Date(TRANSACTION_DATE),
time = format(TRANSACTION_DATE, "%T"))
Read ?strptime for all formatting options.
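One point from ?strptime that matters for these timestamps: %p is documented as working together with %I (12-hour clock) and not with %H. Here is a base R sketch of the difference on one of the sample values. It assumes an English locale so that OCT and AM are recognised, and uses tz = "UTC" purely for reproducibility. The variable names are illustrative.

```r
x <- "01-OCT-18 12.01.23.000000 AM"

# 12-hour clock: 12 AM is read as just after midnight
with_I <- as.POSIXct(x, format = "%d-%b-%y %I.%M.%OS %p", tz = "UTC")

# 24-hour clock: the hour is taken literally and the AM marker has no effect
with_H <- as.POSIXct(x, format = "%d-%b-%y %H.%M.%OS %p", tz = "UTC")

format(with_I, "%H:%M:%S")
format(with_H, "%H:%M:%S")
```

With data that only contains hours from 1 to 11 AM the two formats agree, so the difference tends to surface only around midnight and in the afternoon.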
data
Using a reproducible example
df <- data.frame(col1 = 1:3, TRANSACTION_DATE = c("01-OCT-18 12.01.23.000000 AM",
"01-OCT-18 12.02.23.000000 AM", "01-OCT-18 12.03.23.000000 AM"))
df
# col1 TRANSACTION_DATE
#1 1 01-OCT-18 12.01.23.000000 AM
#2 2 01-OCT-18 12.02.23.000000 AM
#3 3 01-OCT-18 12.03.23.000000 AM
I would use the lubridate package:
library(lubridate)
library(dplyr)
df %>%
mutate(TRANSACTION_DATE = dmy_hms(TRANSACTION_DATE),
Date = date(TRANSACTION_DATE),
time = format(TRANSACTION_DATE, "%T"))
I want to compare the values in one column, the end date (end_date), with the system date (todays_date). Both columns are of type character.
Input:
$ name: chr "Abby" "Abby" "Abby" "Abby" ...
$ std: int 2 3 4 5 6 7 8 9 10 11 ...
$ end_date: chr "25-02-2016" "25-02-2016" "25-03-2018" "25-02-2019" ...
$ todays_date: chr "07-03-2018" "07-03-2018" "07-03-2018" "07-03-2018" ...
Is there any way I can pass a sqldf statement where I can get all the values of the input csv where end_date < todays_date? Any way other than a sqldf statement where I can extract the values of the csv where end_date< todays_date will do.
I tried a few variations of the query below, but I can't seem to get the required output:
sel_a <- sqldf(paste("SELECT * FROM input_a WHERE end_date<",
todays_date, "", sep = ""))
sel_a
PS: I have a huge amount of data and have reduced it to fit this question.
Any help would be appreciated.
To get a more specific answer, make a reproducible example.
Convert the date column from character to Date objects. The sample values are in day-month-year order, so use dmy(), e.g.:
library(lubridate)
your_df$end_date <- dmy(your_df$end_date)
Then you don't even need a column for today's date; just use Sys.Date() in the filter condition:
library(dplyr)
filter(your_df, end_date < Sys.Date())
# will return a data frame with those rows that have a date before today.
Or if you prefer:
your_df[your_df$end_date < Sys.Date(),]
# produces the same rows
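The same steps can be sketched in base R alone. The data frame mirrors the values shown in the question, and a fixed cutoff date stands in for Sys.Date() so the result is reproducible. All names are illustrative.

```r
your_df <- data.frame(
  name = "Abby",
  std = 2:5,
  end_date = c("25-02-2016", "25-02-2016", "25-03-2018", "25-02-2019"),
  stringsAsFactors = FALSE
)

# Comparing the raw strings would order them by day-of-month first,
# so parse them as day-month-year Dates before filtering
your_df$end_date <- as.Date(your_df$end_date, format = "%d-%m-%Y")

# Fixed stand-in for Sys.Date(), matching the todays_date in the question
cutoff <- as.Date("2018-03-07")

expired <- your_df[your_df$end_date < cutoff, ]
expired
```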
Using the raw input shown in the Note at the end, first convert the dates to "Date" class and then use any of the alternatives shown. The first two compare end_date against the todays_date column in the input and the last two compare it against Sys.Date(). We show both sqldf and base solutions.
library(sqldf)
fmt <- "%d-%m-%Y"
Input <- transform(Input_raw, end_date = as.Date(end_date, fmt),
todays_date = as.Date(todays_date, fmt))
# 1
sqldf("select * from Input where end_date <= todays_date")
# 2
subset(Input, end_date <= todays_date)
# 3
fn$sqldf("select * from Input where end_date <= `Sys.Date()`")
# 4
subset(Input, end_date <= Sys.Date())
Note
The Input in reproducible form:
Input_raw <- data.frame(name = "Abby", std = 2:5,
end_date = c("25-02-2016", "25-02-2016", "25-03-2018", "25-02-2019"),
todays_date = "07-03-2018", stringsAsFactors = FALSE)
I am trying to add to a date using sqldf. I know it should be simple, but I can't figure out what is wrong with my date format. Using:
sqldf("select date(model_date, '+1 day') from lapse_test")
gives answers like '-4666-01-23'.
The model_date values are of class Date and look like 2015-01-01.
I previously made them from a character string ('12/1/2015') using
lapse_test$model_date <- as.Date(lapse_test$date1,format = "%m/%d/%Y") or
lapse_test$model_date <- as.POSIXct(lapse_test$date1,format = "%m/%d/%Y")
I'm guessing this is the problem? Any ideas?
Passing a character variable to the date() function seems to work:
df <- data.frame(a=as.Date("2010-10-01"))
df$b <- as.character(df$a)
sqldf("select date(a) from df")
# date(a)
# 1 -4672-08-24
sqldf("select date(b) from df")
# date(b)
# 1 2010-10-01
sqldf("select date(b, '+1 day') from df")
# date(b, '+1 day')
# 1 2010-10-02
Note that you can do (some) arithmetic on Date objects in R directly, without needing SQL:
df$a <- df$a + 1
df
# a b
# 1 2010-10-02 2010-10-01
SQLite's date functions treat a number as a count of days since Nov 24, 4714 BC, so the integer 16770 that R stores for the example date 2015-12-01 comes back as an ancient date somewhere in 4667 BC.
The difference between the R origin of 1970-01-01 and the SQLite origin is 2440588 days, so you can add this constant if you want:
test <- data.frame(model_date=as.Date("12/1/2015",format="%m/%d/%Y"))
sqldf("select date(model_date + 2440588, '+1 day') as select_date from test")
# select_date
#1 2015-12-02
@HongOoi's answer is probably better, but I thought the underlying workings might be interesting to know.
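The offset arithmetic can be checked in plain R without SQLite. Here is a small sketch, assuming only that R stores a Date as a count of days since 1970-01-01. The variable names are illustrative.

```r
d <- as.Date("2015-12-01")

# The number R stores internally, and what sqldf hands to SQLite
r_days <- as.numeric(d)

# Shift onto the day scale that SQLite's date() expects
julian_day <- r_days + 2440588

# Reversing the shift recovers the original Date
back <- as.Date(julian_day - 2440588, origin = "1970-01-01")

r_days
julian_day
back
```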