Find Closest Date - sqlite

I'm trying to run a query that will return rows sorted by closest to today's date.
Here is some data:
| date |
|----------|
|2012-12-02|
|2012-12-04|
|2012-12-10|
|2012-12-15|
|2012-12-29|
|2013-01-02|
|2013-01-04|
Here is my query:
SELECT * FROM days
ORDER BY ABS( strftime( "%s", date ) - strftime( "%s", 2012-12-28 ) ) ASC
It just returns the rows in the same order I posted above, I want to get a result like
| date |
|----------|
|2012-12-29|
|2013-01-02|
|2013-01-04|
|2012-12-15|
|2012-12-10|
|2012-12-04|
My date field is a string in the format yyyy-MM-dd (there's a reason I'm not storing it as a timestamp). What am I doing wrong?

There seems to be a mistake on the code:
SELECT * FROM days
ORDER BY ABS( strftime( "%s", date ) - strftime( "%s", 2012-12-28 ) ) ASC
Written this way, the query will show the results just ordered by date.
The reason: 2012-12-28 will be treated as an arithmetic operation between integers. You should write '2012-12-28', to indicate that this is a date.

You don't have to use strftime.
SELECT * FROM days
WHERE date <= '2012-12-28'
ORDER BY date ASC
-- LIMIT 5

Related

Impala - Working hours between two dates in impala

I have two time stamps #starttimestamp and #endtimestamp. How to calculate number of working hours between these two
Working hours is defined below:
Mon- Thursday (9:00-17:00)
Friday (9:00-13:00)
Have to work in impala
think i found a better solution.
we will create a series of numbers using a large table. You can get a time dimension type table too. Make it doenst get truncated. I am using a large table from my db.
Use this series to generate a date range between start and end date.
date_add (t.start_date,rs.uniqueid) -- create range of dates
join (select row_number() over ( order by mycol) as uniqueid -- create range of unique ids
from largetab) rs
where end_date >=date_add (t.start_date,rs.uniqueid)
Then we will calculate total hour difference between the timestamp using unix timestamp considering date and time.
unix_timestamp(endtimestamp - starttimestamp )
Exclude non working hours like 16hours on M-T, 20hours on F, 24hours on S-S.
case when dayofweek ( dday) in (1,7) then 24
when dayofweek ( dday) =5 then 20
else 16 end as non work hours
Here is complete SQL.
select
end_date, start_date,
diff_in_hr - sum(case when dayofweek ( dday) in (1,7) then 24
when dayofweek ( dday) =5 then 20
else 16 end ) total_workhrs
from (
select (unix_timestamp(end_date)- unix_timestamp(start_date))/3600 as diff_in_hr , end_date, start_date,date_add (t.start_date,rs.uniqueid) as dDay
from tdate t
join (select row_number() over ( order by mycol) as uniqueid from largetab) rs
where end_date >=date_add (t.start_date,rs.uniqueid)
)rs2
group by 1,2,diff_in_hr

Carry Forward values in Presto

I am using the below query to pivot my data and generate a CSV but the problem is I have a dataset in which the data points are coming in a scattered way with each timestamp.
with map_date as (
SELECT
vin,
epoch,
timestamp,
date,
map_agg(signalName, value) as map_values
from hive.vehicle_signals.vehicle_signals_flat
where date(date) = date('2020-03-12')
and date(cast(from_unixtime(epoch) as timestamp) - interval '0' hour) = current_date - interval '2' day
and vin = '000011'
and signalName in ('timestamp','epoch','msgId','usec','vlan','vin','msgName','value')
GROUP BY vin, epoch, timestamp, date
order by timestamp desc
)
SELECT
epoch
, timestamp
, CASE WHEN element_at(map_values, 'value') IS NOT NULL THEN map_values['value'] ELSE NULL END AS value
, vin
, current_date - interval '2' day AS date
from map_date
I get the following CSV as a result. Is there a way I can carry forward the value until a new value is found at a newer timestamp? Like in the image below the value '14.3' comes and the next value '16.5' comes after a few timestamps, How can I carry the value '14.3' till row 7th and repeat the logic on the entire column. How can I make my output field look like column 'G' in the image using Presto?
Thanks in advance!!
You can use a mysql #variable to store the last value, for example:
SELECT
epoch
, timestamp
, CASE WHEN element_at(map_values, 'value') IS NOT NULL THEN #last_value:= map_values['value'] ELSE #last_value END AS value
, vin
, current_date - interval '2' day AS date
from map_date, (select #last_value:=0) v
The last part, (select #last_value:=0) v is to initialize the #last_value variable.
A basic tutorial
https://www.mysqltutorial.org/mysql-variables/
More advanced tutorial with additional info
https://www.xaprb.com/blog/2006/12/15/advanced-mysql-user-variable-techniques/

SQL: Sum based on specified dates

Thanks again for the help everyone. I went with the script below...
SELECT beginning, end,
(SELECT SUM(sale) FROM sales_log WHERE date BETWEEN beginning AND `end` ) AS sales
FROM performance
and I added a salesperson column to both the performance table and sales_log but it winds up crashing DB Browser. What is the issue here? New code below:
SELECT beginning, end, salesperson
(SELECT SUM(sale) FROM sales_log WHERE (date BETWEEN beginning AND end) AND sales_log.salesperson = performance.salesperson ) AS sales
FROM performance
I believe that the following may do what you wish or be the basis for what you wish.
WITH sales_log_cte AS
(
SELECT substr(date,(length(date) -3),4)||'-'||
CASE WHEN length(replace(substr(date,instr(date,'/')+1,2),'/','')) < 2 THEN '0' ELSE '' END
||replace(substr(date,instr(date,'/')+1,2),'/','')||'-'||
CASE WHEN length(substr(date,1,instr(date,'/') -1)) < 2 THEN '0' ELSE '' END||substr(date,1,instr(date,'/') -1) AS date,
CAST(sale AS REAL) AS sale
FROM sales_log
),
performance_cte AS
(
SELECT substr(beginning,(length(beginning) -3),4)||'-'||
CASE WHEN length(replace(substr(beginning,instr(beginning,'/')+1,2),'/','')) < 2 THEN '0' ELSE '' END
||replace(substr(beginning,instr(beginning,'/')+1,2),'/','')||'-'||
CASE WHEN length(substr(beginning,1,instr(beginning,'/') -1)) < 2 THEN '0' ELSE '' END||substr(beginning,1,instr(beginning,'/') -1)
AS beginning,
substr(`end`,(length(`end`) -3),4)||'-'||
CASE WHEN length(replace(substr(`end`,instr(`end`,'/')+1,2),'/','')) < 2 THEN '0' ELSE '' END
||replace(substr(`end`,instr(`end`,'/')+1,2),'/','')||'-'||
CASE WHEN length(substr(`end`,1,instr(`end`,'/') -1)) < 2 THEN '0' ELSE '' END||substr(`end`,1,instr(`end`,'/') -1)
AS `end`
FROM performance
)
SELECT beginning, `end` , (SELECT SUM(sale) FROM sales_log_cte WHERE date BETWEEN beginning AND `end` ) AS sales
FROM performance_cte
;
From your data this results in :-
As can be seen the bulk of the code is converting the dates into a format (i.e. YYYY-MM-DD) that is usable/recognisable by SQLite for the BETWEEN clause.
Date And Time Functions
I don't believe that you want a join between performance (preformance_cte after reformatting the dates) and sales_log (sales_log_cte) as this will be a cartesian product and then sum will sum all the results within the range.
The use of end as a column name is also awkward as it is a KEYWORD requiring it to be enclosed (` grave accents used in the above).
The above works by using 2 CTE's (Common Table Expresssions), which are temporary tables who'd life time is for the query in which they are used.
The first sales_log_cte is simply the sales_log table but with the date reformatted. The second, likewise, is simply the performace table with the dates reformatted.
If the tables already has suitable date formatting then all of the above could simply be :-
SELECT beginning, `end` , (SELECT SUM(sale) FROM sales_log WHERE date BETWEEN beginning AND `end` ) AS sales FROM performance;

hive time_stamp convert into UTC with time_offset in UTC

I have 2 columns: time_stamp and time_offset. Both are STRING data type.
How can we convert one column values into UTC with the help of second column which is in UTC? Is their any hive or from unix solution to convert time_stamp column into UTC?
hive> select time_stamp from table1 limit 2;
OK
20170717-22:31:57.348
20170719-21:10:15.393
[yyyymmdd-hh:mm:ss.msc] this column is in local time
hive> select time_offset from table1 limit 2;
OK
-05:00
+05:00
[‘+hh:mm’ or ‘-hh:mm’ ] this column is in UTC
You can use the Hive Date Functions unix_timestamp and from_unixtime to perform the conversion.
Code
WITH table1 AS (
SELECT '20170717-22:31:57.348' AS time_stamp, '-05:00' AS time_offset UNION ALL
SELECT '20170719-21:10:15.393' AS time_stamp, '+05:00' AS time_offset
)
SELECT
time_stamp,
time_offset,
unix_timestamp(concat(time_stamp, ' ', time_offset), 'yyyyMMdd-HH:mm:ss.SSS X') AS unix_timestamp_with_offset,
from_unixtime(unix_timestamp(concat(time_stamp, ' ', time_offset), 'yyyyMMdd-HH:mm:ss.SSS X'), 'yyyyMMdd-HH:mm:ss.SSS') AS string_timestamp_with_offset
FROM table1
;
Result Set
+------------------------+--------------+-----------------------------+-------------------------------+--+
| time_stamp | time_offset | unix_timestamp_with_offset | string_timestamp_with_offset |
+------------------------+--------------+-----------------------------+-------------------------------+--+
| 20170717-22:31:57.348 | -05:00 | 1500348717 | 20170717-20:31:57.000 |
| 20170719-21:10:15.393 | +05:00 | 1500480615 | 20170719-09:10:15.000 |
+------------------------+--------------+-----------------------------+-------------------------------+--+
Explanation
unix_timestamp can accept an optional format string in the same syntax as Java SimpleDateFormat. I am guessing that your offsets are using the ISO 8601 syntax, so let's use the X format specifier. Then, we can use the concat String Operator to combine time_stamp and time_offset before passing to unix_timestamp.
The unix_timestamp function results in a numeric timestamp specified as seconds since epoch. To convert that back to a string representation, we can pass the result obtained from unix_timestamp through from_unixtime, this time specifying our original format specifier.
(Please do test thoroughly to make sure the results are making sense in your environment. Time zone math can be tricky.)

Not able to filter records based on date filter in Informix

I want to put filter on an Informix query:
WHERE agentstatedetail.eventdatetime < '1753-01-01 00:00:00' - INTERVAL(3) DAY TO DAY
but it fails ...
Please tell where it goes wrong.
As noted in a comment, the solution is to ensure that the string is interpreted as a DATETIME value. The simple way to do that is to use the DATETIME literal notation:
DATETIME(1753-01-01 00:00:00) YEAR TO SECOND
To demonstrate:
CREATE TABLE agentstatedetail
(
eventdatetime DATETIME YEAR TO SECOND NOT NULL PRIMARY KEY,
eventname VARCHAR(64) NOT NULL
);
INSERT INTO agentstatedetail VALUES('1752-12-25 12:00:00', 'Christmas Day, Noon, 1752');
INSERT INTO agentstatedetail VALUES('1752-12-31 12:00:00', 'New Year''s Eve, Noon, 1752');
INSERT INTO agentstatedetail VALUES('1753-01-01 12:00:00', 'New Year''s Day, Noon, 1753');
SELECT * FROM agentstatedetail WHERE agentstatedetail.eventdatetime < '1753-01-01 00:00:00' - INTERVAL(3) DAY TO DAY;
This is your original WHERE clause embedded into a minimal SELECT statement. It yields the error:
SQL -1261: Too many digits in the first field of datetime or interval.
(NB: It would have been helpful to include the error message in the question.)
Here's an alternative version of the query, with the DATETIME literal in place:
SELECT * FROM agentstatedetail
WHERE agentstatedetail.eventdatetime < DATETIME(1753-01-01 00:00:00) YEAR TO SECOND -
INTERVAL(3) DAY TO DAY
;
Output from the sample data:
1752-12-25 12:00:00|Christmas DAY, Noon, 1752
I observe that the value calculated is a constant; you could rewrite the code as:
SELECT * FROM agentstatedetail
WHERE agentstatedetail.eventdatetime < DATETIME(1752-12-29 00:00:00) YEAR TO SECOND
I suspect that the value is passed as a parameter somewhere along the line.
Alternatively, you can cast the string to a DATETIME value and you'd get the same result:
SELECT * FROM agentstatedetail
WHERE agentstatedetail.eventdatetime < CAST('1753-01-01 00:00:00' AS DATETIME YEAR TO SECOND) -
INTERVAL(3) DAY TO DAY
;
or:
SELECT * FROM agentstatedetail
WHERE agentstatedetail.eventdatetime < '1753-01-01 00:00:00'::DATETIME YEAR TO SECOND -
INTERVAL(3) DAY TO DAY

Resources