I need to get the full date with timestamp from a column that has the format 'yyyymm'. For example, I need to get 2007-01-01 00:00:00.000 from 200701.
My Column 'A' consists of:
200701
200702
200703
...
...
...
I need to calculate another column 'B' showing:
2007-01-01 00:00:00.000
2007-02-01 00:00:00.000
2007-03-01 00:00:00.000
2007-04-01 00:00:00.000
Column B has to be a calculation based on Column A or Sys.Calendar. Using platform Teradata 14.
Please let me know the answer. Thank you in advance for your answers.
If the datatype is a string:
cast(col as timestamp(3) format 'yyyymm')
If it's numeric:
cast(cast(col * 100 - 19000000 + 1 as date) as timestamp(3))
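The numeric cast relies on Teradata storing a DATE internally as the integer (year - 1900) * 10000 + month * 100 + day. Here is a minimal Python sketch of that arithmetic; the function names are hypothetical and only illustrate the encoding, they are not part of Teradata:

```python
from datetime import date

def yyyymm_to_teradata_int(yyyymm: int) -> int:
    """Mirror the SQL expression col * 100 - 19000000 + 1."""
    # * 100 appends an empty two-digit day slot (yyyymm00),
    # - 19000000 rebases the year to 1900, and + 1 selects day 1.
    return yyyymm * 100 - 19000000 + 1

def teradata_int_to_date(n: int) -> date:
    """Decode (year - 1900) * 10000 + month * 100 + day into a date."""
    return date(n // 10000 + 1900, n // 100 % 100, n % 100)

encoded = yyyymm_to_teradata_int(200701)
first_of_month = teradata_int_to_date(encoded)
print(encoded, first_of_month.isoformat())
```

Casting that date to timestamp(3) then only adds the 00:00:00.000 time part.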
Related
I want to change the column of "dob" to date format.
I used the below code but did not see changes in my data.
Date = as.Date(ped$dob)
Can you guide me through this?
ID sex dob yob
1: 126000 M 20220523 2022
2: 375000 M 20220523 2022
Try this (note the uppercase Y in the format, which stands for a four-digit year):
as.Date(as.character(ped$dob), '%Y%m%d')
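For comparison, the same conversion can be sketched in Python, where the format codes have the same meaning. The helper name parse_yyyymmdd is hypothetical:

```python
from datetime import date, datetime

def parse_yyyymmdd(value) -> date:
    # str() plays the role of as.character(): dob is stored as a number.
    # %Y is the four-digit year; lowercase %y would expect only two digits.
    return datetime.strptime(str(value), "%Y%m%d").date()

dob = parse_yyyymmdd(20220523)
print(dob)
```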
I have this table, date is a TEXT field and the only field.
date
2020-01-01
2010-03-01
2010-06-01
2011-01-01
2012-01-01
2013-01-01
2014-01-01
2015-01-01
I want the table to join itself on the date that is 1 year smaller than the target record. I tried the query below, but it doesn't seem to work when I just add a number after calling strftime.
SELECT d0.*, d1.* from the_table d0 left join the_table d1 on strftime(d0."date",'%Y') = strftime(d1."date",'%Y') + 1;
What I want is the following result
date date
2020-01-01 None
2010-03-01 2011-01-01
2010-06-01 2011-01-01
2011-01-01 2012-01-01
...
But this is what it returned instead.
I have several questions regarding this issue:
Besides the example that joins the table on a specific difference in years, how do I do this for months, days, etc.?
Does strftime use the index if there's an index created on that field? The date field is the primary key field in the example. How do I know if I'm using indices? If not, how do I make it use the index?
The syntax for the function strftime() requires the format to be the first argument and the date to be the second.
Also, strftime() returns a string, so you must convert it to a number (implicitly by adding 0) if you want to compare it to a number:
SELECT d0.*, d1.*
FROM the_table d0 LEFT JOIN the_table d1
ON strftime('%Y', d0."date") + 1 = strftime('%Y', d1."date") + 0;
Results:
date        date
2020-01-01  null
2010-03-01  2011-01-01
2010-06-01  2011-01-01
2011-01-01  2012-01-01
2012-01-01  2013-01-01
2013-01-01  2014-01-01
2014-01-01  2015-01-01
2015-01-01  null
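To reproduce this result locally, the corrected join can be run against an in-memory database with Python's sqlite3 module. This is only a sketch built from the sample data in the question; the ORDER BY is added here just to make the output order predictable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE the_table ("date" TEXT PRIMARY KEY)')
dates = ["2020-01-01", "2010-03-01", "2010-06-01", "2011-01-01",
         "2012-01-01", "2013-01-01", "2014-01-01", "2015-01-01"]
conn.executemany("INSERT INTO the_table VALUES (?)", [(d,) for d in dates])

# Format string first, date second; "+ 1" and "+ 0" turn the
# text returned by strftime() into numbers before comparing.
rows = conn.execute("""
    SELECT d0."date", d1."date"
    FROM the_table d0 LEFT JOIN the_table d1
      ON strftime('%Y', d0."date") + 1 = strftime('%Y', d1."date") + 0
    ORDER BY d0."date"
""").fetchall()

pairs = dict(rows)  # each left-hand date matches at most one row here
for left, right in rows:
    print(left, right)
```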
You can apply the same code by changing the format to '%m' or '%d' to compare month or day respectively, if the year is not relevant.
But, if you want to join on the next day of each date you can do it with the function date():
SELECT d0.*, d1.*
FROM the_table d0 LEFT JOIN the_table d1
ON date(d0."date", '+1 day') = d1."date";
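The modifier string passed to date() is what controls the offset, so the same join pattern covers days, months and years. A small sketch using Python's sqlite3 module to show what the modifiers return for one sample date:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One call per modifier, all applied to the same starting date.
shifted = conn.execute(
    "SELECT date(?, '+1 day'), date(?, '+1 month'), date(?, '+1 year')",
    ("2010-03-01",) * 3,
).fetchone()
print(shifted)
```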
Also, strftime() and date() are functions, and normally SQLite would not use any index on a column that is wrapped in these functions.
SQLite supports indexes on expressions, but I don't think that this would help in your case.
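To check whether a query uses an index, SQLite offers EXPLAIN QUERY PLAN. The sketch below compares a plain equality join with the strftime() join; query_plan is a hypothetical helper, and the exact wording of the plan text differs between SQLite versions. Typically a line starting with SEARCH and naming an index indicates an index lookup, while SCAN means every row is visited:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE the_table ("date" TEXT PRIMARY KEY)')

def query_plan(sql: str) -> str:
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "\n".join(row[-1] for row in plan_rows)

plain_join = query_plan(
    'SELECT * FROM the_table d0 LEFT JOIN the_table d1 '
    'ON d0."date" = d1."date"'
)
strftime_join = query_plan(
    'SELECT * FROM the_table d0 LEFT JOIN the_table d1 '
    "ON strftime('%Y', d0.\"date\") + 1 = strftime('%Y', d1.\"date\") + 0"
)
print(plain_join)
print("---")
print(strftime_join)
```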
Not an exact answer or correction to your current approach, but you could use the DATE() function here with an offset of 1 year:
SELECT d0.*, d1.*
FROM the_table d0
LEFT JOIN the_table d1
ON d0."date" = DATE(d1."date", '+1 year');
I have a data with Date as follows:
2010-01-01
2010-02-07
2010-02-09
2010-03-09
2010-04-06
....
2021-03-31
2021-04-10
I want an output that numbers the observed months consecutively, based on the dates above, such as: 1, 2, 3, ..., 100
I tried this code: as.numeric(as.factor(format(flights.input$Date,"%m")))
But it stops counting at 12, and counts again from 1 while I want to count consecutively.
You can try:
data.table::setDT(df)[, NumberOfMonth := data.table::rleid(format(as.Date(as.character(Date)), "%m"))]
We can use rle from base R to create the sequence after extracting the month from the 'Date' column:
fm1 <- format(flights.input$Date, "%m")
with(rle(fm1), rep(seq_along(values), lengths))
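Both answers number runs of identical month values in row order, so the counter keeps growing across years instead of restarting at 12. A minimal Python sketch of the same run-length idea, assuming the dates are sorted ISO strings (month_run_ids is a hypothetical name):

```python
from itertools import groupby

def month_run_ids(dates):
    """Number consecutive runs of equal months 1, 2, 3, ... like rleid()."""
    ids = []
    # groupby() starts a new group each time the month key changes,
    # so the same month in a later year gets a fresh id.
    runs = groupby(dates, key=lambda d: d[5:7])
    for run_id, (_, run) in enumerate(runs, start=1):
        ids.extend(run_id for _ in run)
    return ids

sample = ["2010-01-01", "2010-02-07", "2010-02-09", "2010-03-09", "2010-04-06"]
print(month_run_ids(sample))
```

As with rle, a month without any observations is simply not counted, so the ids number observed months, not calendar months.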
I have a Hive table with millions of records.
The input is of the following type:
Input:
rowid |starttime |endtime |line |status
--- 1 |2007-07-19 00:05:00 |2007-07-19 00:23:00 |l1 |s1
--- 2 |2007-07-20 00:00:10 |2007-07-20 00:22:00 |l1 |s2
--- 3 |2007-07-19 00:00:00 |2007-07-19 00:11:00 |l2 |s2
What I want to do is first order the table by starttime, grouped by line.
Then find the difference between the endtime of one row and the starttime of the next consecutive row. If the difference is more than 5 mins, add a new row in between, in a new table, with status misstime.
For input rows 1 & 2 the time difference is 1 hour 10 mins, so first I will create a row for the 19th that completes that day with missing time, and then add one more row for the 20th, as below.
output:
rowid |starttime |endtime |line |status
--- 1 |2007-07-19 00:05:00 |2007-07-19 00:23:00 |l1 |s1
--- 2 |2007-07-19 00:23:01 |2007-07-19 00:00:00 |l1 |misstime
--- 3 |2007-07-20 00:00:01 |2007-07-20 00:00:09 |l1 |misstime
--- 4 |2007-07-20 00:00:10 |2007-07-20 00:22:00 |l1 |s2
--- 3 |2007-07-19 00:00:00 |2007-07-19 00:11:00 |l2 |s2
Can anyone help me achieve this directly in hue - hive ?
Unix script will also do.
Thanks in advance.
The solution template is:
Use LAG() function to get previous line starttime or endtime.
For each line, calculate the difference between the current and previous time.
Filter rows with difference more than 5 minutes.
Transform the dataset into required output.
Example:
insert into yourtable
select
    s.rowid,
    s.starttime,
    s.endtime
    -- calculate your status here, etc.
from
(
    select rowid, starttime, endtime, line,
           -- endtime of the previous row within the same line
           lag(endtime) over (partition by line order by starttime) as prev_endtime
    from yourtable
) s
-- gap between the previous endtime and the current starttime > 5 min
where (unix_timestamp(s.starttime) - unix_timestamp(s.prev_endtime)) / 60 > 5
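The template can also be traced outside Hive. Below is a rough Python sketch of the same steps on the question's three input rows; fill_gaps is a hypothetical helper, and as a simplification it emits one misstime row per gap instead of splitting the gap at midnight as the desired output does:

```python
from datetime import datetime, timedelta
from itertools import groupby

FMT = "%Y-%m-%d %H:%M:%S"
rows = [  # (starttime, endtime, line, status) from the question's input
    ("2007-07-19 00:05:00", "2007-07-19 00:23:00", "l1", "s1"),
    ("2007-07-20 00:00:10", "2007-07-20 00:22:00", "l1", "s2"),
    ("2007-07-19 00:00:00", "2007-07-19 00:11:00", "l2", "s2"),
]

def fill_gaps(rows, max_gap=timedelta(minutes=5)):
    one_second = timedelta(seconds=1)
    out = []
    # Order by line, then starttime (this format sorts correctly as text).
    ordered = sorted(rows, key=lambda r: (r[2], r[0]))
    for line, group in groupby(ordered, key=lambda r: r[2]):
        prev_end = None  # plays the role of lag(endtime)
        for start_s, end_s, _, status in group:
            start = datetime.strptime(start_s, FMT)
            if prev_end is not None and start - prev_end > max_gap:
                gap_start = (prev_end + one_second).strftime(FMT)
                gap_end = (start - one_second).strftime(FMT)
                out.append((gap_start, gap_end, line, "misstime"))
            out.append((start_s, end_s, line, status))
            prev_end = datetime.strptime(end_s, FMT)
    return out

result = fill_gaps(rows)
for row in result:
    print(row)
```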
Hi, this is a two-part question.
How do I create an auto-incrementing data frame of dates?
I want to automatically create a data frame with a column "dates" whose values are in one-month intervals from 2011-05-01 (1st May 2011) until today (2015-12-01).
Output:
S.no. Date
1 2011-05-01
2 2011-06-01
3 2011-07-01
. .
55 2015-12-01
Second, I have a data frame with each customer's name and expiry date, for example:
names<-c("Tom","David")
expiryDate<-as.Date(c("2011-05-22","2011-06-19"))
df<-data.frame(names,expiryDate)
df
Name Expirydate
Tom 2011-05-22
David 2011-06-19
I want to process the expiry dates to check whether each customer is active in that month.
Name 2011-05-01 2011-06-01 2011-07-01 ... (till 2015-12-01)
Tom TRUE FALSE FALSE
David TRUE TRUE FALSE
As @Roland mentioned, you can use seq.Date to generate a sequence of dates:
DateColumns <- seq.Date(as.Date("2011/05/01"), as.Date("2015/12/1"), by = "1 month")
DateColumnvalues <- t(sapply(df$expiryDate, function(x) x > DateColumns))
x <- data.frame(DateColumnvalues, row.names = df$names)
colnames(x) <- DateColumns
This generates a sequence of dates (DateColumns) for the 1st of every month and then uses sapply to check whether each expiryDate is greater than those dates.
The first line of the code would answer first part of your question as well.
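For readers who want to trace the logic outside R, here is a rough Python sketch of the same two steps using only the standard library. month_starts is a hypothetical helper standing in for seq.Date, and the customer data is copied from the question:

```python
from datetime import date

def month_starts(first: date, last: date):
    """First day of every month from first to last, inclusive."""
    months = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        months.append(date(year, month, 1))
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months

date_columns = month_starts(date(2011, 5, 1), date(2015, 12, 1))
expiry = {"Tom": date(2011, 5, 22), "David": date(2011, 6, 19)}

# Same comparison as the sapply() call: a customer counts as active
# in a month while the expiry date is after that month's first day.
active = {name: [exp > d for d in date_columns] for name, exp in expiry.items()}
print(date_columns[:3])
print(active["Tom"][:3], active["David"][:3])
```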