SQLite Count() of true boolean values in GROUP - sqlite

I want to get messages from a view (it is a view because it contains messages from multiple sources) in an SQLite database. If there are multiple messages from the same user, I want only the newest one. The view is already sorted by Date DESC.
If I use the query SELECT *, COUNT(IsRead = false) FROM Messages GROUP BY User I get the newest message from each user and the number of messages for each user.
However, instead of the total number of messages, I only want the number of unread messages (IsRead = false).
For the following table:
+-------+--------+------------+
| User | IsRead | Date |
+-------+--------+------------+
| User1 | false | 2020-01-05 |
| User2 | false | 2020-01-04 |
| User1 | false | 2020-01-03 |
| User3 | true | 2020-01-02 |
| User2 | true | 2020-01-01 |
| User3 | true | 2020-01-01 |
+-------+--------+------------+
I would like to get the following result
+-------+--------+------------+---------+
| User | IsRead | Date | notRead |
+-------+--------+------------+---------+
| User1 | false | 2020-01-05 | 2 |
| User2 | false | 2020-01-04 | 1 |
| User3 | true | 2020-01-02 | 0 |
+-------+--------+------------+---------+
How can I achieve that? The COUNT(IsRead = false) part in my query is just the way I imagined it could work. I could not find anything about how to do that in sqlite.
Also: Am I correct about always getting the most recent message from each user, if the output from the view is already sorted by Date descending? That seemed to be the case in my tests but I just want to make sure that was not a fluke.

From SELECT/The ORDER BY clause:
If a SELECT statement that returns more than one row does not have an
ORDER BY clause, the order in which the rows are returned is
undefined.
This means that even if the view is defined to return sorted rows, selecting from the view is not guaranteed to maintain that sort order.
Also, with this query:
SELECT *, COUNT(IsRead = false) FROM Messages GROUP BY User
even if you get the newest message from each user, this is not guaranteed, because it is not a documented feature.
So, for your question:
Am I correct about always getting the most recent message from each
user, if the output from the view is already sorted by Date
descending?
the answer is no.
You can't rely on coincidental results, but you can rely on documented features.
A documented feature is the use of Bare columns in an aggregate query which can solve your problem with a simple aggregation query:
SELECT User,
       IsRead,
       MAX(Date) Date,
       SUM(NOT IsRead) notRead -- NOT turns each false (0) into 1 and each true (1) into 0, so the sum counts the unread rows
FROM Messages
GROUP BY User;
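A minimal sketch of the query above, run through Python's built-in sqlite3 module against the sample table from the question. It assumes IsRead is stored as the integers 0 and 1 (SQLite has no separate boolean type). It also shows why COUNT(IsRead = false) returned the total: COUNT(expr) counts every row where expr is not NULL, and a comparison that yields 0 is still not NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: IsRead is stored as 0 (false) / 1 (true).
conn.execute("CREATE TABLE Messages (User TEXT, IsRead INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO Messages VALUES (?, ?, ?)",
    [
        ("User1", 0, "2020-01-05"),
        ("User2", 0, "2020-01-04"),
        ("User1", 0, "2020-01-03"),
        ("User3", 1, "2020-01-02"),
        ("User2", 1, "2020-01-01"),
        ("User3", 1, "2020-01-01"),
    ],
)

# COUNT(expr) counts non-NULL values; (IsRead = 0) is 0 or 1, never NULL,
# so this counts every row in the group.
counts = conn.execute(
    "SELECT User, COUNT(IsRead = 0) FROM Messages GROUP BY User ORDER BY User"
).fetchall()

# With MAX(Date) in the select list, the bare column IsRead is taken from
# the row that holds the maximum Date within each group.
rows = conn.execute(
    """
    SELECT User, IsRead, MAX(Date) AS Date, SUM(NOT IsRead) AS notRead
    FROM Messages
    GROUP BY User
    ORDER BY User
    """
).fetchall()

print(counts)  # [('User1', 2), ('User2', 2), ('User3', 2)]
for row in rows:
    print(row)
```

The ORDER BY User is only there to make the printed order deterministic.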
Or, use window functions:
SELECT DISTINCT User,
       FIRST_VALUE(IsRead) OVER (PARTITION BY User ORDER BY Date DESC) IsRead,
       MAX(Date) OVER (PARTITION BY User) Date,
       SUM(NOT IsRead) OVER (PARTITION BY User) notRead
FROM Messages;
See the demo.
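The window-function variant can be tried the same way. This is a sketch only: it assumes an SQLite build of 3.25 or later (window functions are not available before that), stores IsRead as 0/1, and uses the hypothetical aliases LatestIsRead and LatestDate so that the aliases cannot be confused with the underlying columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Messages (User TEXT, IsRead INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO Messages VALUES (?, ?, ?)",
    [
        ("User1", 0, "2020-01-05"),
        ("User2", 0, "2020-01-04"),
        ("User1", 0, "2020-01-03"),
        ("User3", 1, "2020-01-02"),
        ("User2", 1, "2020-01-01"),
        ("User3", 1, "2020-01-01"),
    ],
)

# Every row of a user gets the same three window results,
# so DISTINCT collapses them to one row per user.
window_rows = conn.execute(
    """
    SELECT DISTINCT User,
           FIRST_VALUE(IsRead) OVER (PARTITION BY User ORDER BY Date DESC) AS LatestIsRead,
           MAX(Date) OVER (PARTITION BY User) AS LatestDate,
           SUM(NOT IsRead) OVER (PARTITION BY User) AS notRead
    FROM Messages
    ORDER BY User
    """
).fetchall()

for row in window_rows:
    print(row)
```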

Related

Enforce uniqueness within a date range or based on the value of another column

I have a table with a large amount of data; moving forward, I would like to enforce uniqueness for a given column in this table. However, the table contains a large amount of rows where that column is non-unique. I am not able to delete or alter these rows.
Is it possible to enforce uniqueness over a given date range, or since a specific date, or based on the value of another column (or something else like that) in MariaDB?
You can create a UNIQUE index on multiple columns, where one column is nullable. For the purposes of the UNIQUE index, MariaDB treats every NULL as a distinct value, so rows are not considered duplicates even if their other indexed column values are the same. See the MariaDB documentation, Getting Started with Indexes - Unique Index:
The fact that a UNIQUE constraint can be NULL is often overlooked. In SQL any NULL is never equal to anything, not even to another NULL. Consequently, a UNIQUE constraint will not prevent one from storing duplicate rows if they contain null values:
CREATE TABLE t1 (a INT NOT NULL, b INT, UNIQUE (a,b));
INSERT INTO t1 values (3,NULL), (3, NULL);
SELECT * FROM t1;
+---+------+
| a | b |
+---+------+
| 1 | 1 |
| 2 | 1 |
| 2 | 2 |
| 3 | NULL |
| 3 | NULL |
+---+------+
You can create such a UNIQUE index on the date column you already have and a new column which indicates if the date value should be unique or not:
CREATE TABLE Foobar(
    id INT AUTO_INCREMENT PRIMARY KEY NOT NULL,
    createdAt DATE NOT NULL,
    dateUniqueMarker BIT NULL DEFAULT 0,
    UNIQUE KEY uq_createdAt(createdAt, dateUniqueMarker)
);
INSERT INTO Foobar(createdAt) VALUES ('2021-11-04'),('2021-11-05'),('2021-11-06');
SELECT * FROM Foobar;
+----+------------+------------------------------------+
| id | createdAt | dateUniqueMarker |
+----+------------+------------------------------------+
| 1 | 2021-11-04 | 0x00 |
| 2 | 2021-11-05 | 0x00 |
| 3 | 2021-11-06 | 0x00 |
+----+------------+------------------------------------+
INSERT INTO Foobar(createdAt) VALUES ('2021-11-05');
ERROR 1062 (23000): Duplicate entry '2021-11-05-\x00' for key 'Foobar.uq_createdAt'
UPDATE Foobar SET dateUniqueMarker = NULL WHERE createdAt = '2021-11-05';
INSERT INTO Foobar(createdAt, dateUniqueMarker) VALUES ('2021-11-05', NULL);
SELECT * FROM Foobar;
+----+------------+------------------------------------+
| id | createdAt | dateUniqueMarker |
+----+------------+------------------------------------+
| 1 | 2021-11-04 | 0x00 |
| 2 | 2021-11-05 | NULL |
| 5 | 2021-11-05 | NULL |
| 3 | 2021-11-06 | 0x00 |
+----+------------+------------------------------------+
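The same mechanism can be tried without a MariaDB server. The sketch below uses SQLite through Python's sqlite3 module as a stand-in, which is an assumption of convenience: SQLite also treats NULLs as distinct in a UNIQUE constraint, but its column types and error messages differ from MariaDB's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Foobar (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        createdAt TEXT NOT NULL,
        dateUniqueMarker INTEGER DEFAULT 0,
        UNIQUE (createdAt, dateUniqueMarker)
    )
    """
)
conn.executemany(
    "INSERT INTO Foobar (createdAt) VALUES (?)",
    [("2021-11-04",), ("2021-11-05",), ("2021-11-06",)],
)

# With the default marker 0, a second row for the same date is rejected.
try:
    conn.execute("INSERT INTO Foobar (createdAt) VALUES ('2021-11-05')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Setting the marker to NULL exempts those rows from the uniqueness check.
conn.execute("UPDATE Foobar SET dateUniqueMarker = NULL WHERE createdAt = '2021-11-05'")
conn.execute("INSERT INTO Foobar (createdAt, dateUniqueMarker) VALUES ('2021-11-05', NULL)")

exempt_rows = conn.execute(
    "SELECT COUNT(*) FROM Foobar WHERE createdAt = '2021-11-05'"
).fetchone()[0]
print(duplicate_rejected, exempt_rows)  # True 2
```

For the original question, this means the existing non-unique rows would get a NULL marker, while new rows keep the default and are checked.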
Without example data and an illustration of the scenario, it's hard to know. If you can update your question with that information, please do.
"Is it possible to enforce uniqueness over a given date range, or since a specific date, or based on the value of another column (or something else like that) in MariaDB?"
If by "enforce" you mean creating a new column and populating it with a unique identifier, then yes, it is possible. If what you really mean is generating a unique value based on another column, that's also possible. The question is, how unique do you want it to be?
Is it like this unique?
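+------------+---------+---------+------------+
| column1    | column2 | column3 | unique_val |
+------------+---------+---------+------------+
| 2021-02-02 | ABC     | DEF     | 1          |
| 2021-02-02 | CBD     | FEA     | 1          |
| 2021-02-03 | BED     | GER     | 2          |
| 2021-02-04 | ART     | TOY     | 3          |
| 2021-02-04 | ZSE     | KSL     | 3          |
+------------+---------+---------+------------+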
Whereby if it's the same date (on column1), it should have the same unique value regardless of column2 & column3 data.
Or like this?
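+------------+---------+---------+------------+
| column1    | column2 | column3 | unique_val |
+------------+---------+---------+------------+
| 2021-02-02 | ABC     | DEF     | 1          |
| 2021-02-02 | CBD     | FEA     | 2          |
| 2021-02-03 | BED     | GER     | 3          |
| 2021-02-04 | ART     | TOY     | 4          |
| 2021-02-04 | ZSE     | KSL     | 5          |
+------------+---------+---------+------------+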
Taking all (or certain) columns to consider the unique value.
Both of the scenarios above can be achieved in a query, without altering the table to add and populate a new column, but of course the latter is also possible.
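The answer does not say which functions it has in mind, so the following is only one possible reading: DENSE_RANK() produces the first numbering (one value per date) and ROW_NUMBER() the second (one value per row). The sketch runs on SQLite (3.25 or later) via Python's sqlite3 module rather than MariaDB; the table name t and the aliases same_per_date and one_per_row are hypothetical. Note that this only generates values in the result set and does not by itself enforce anything on insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (column1 TEXT, column2 TEXT, column3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("2021-02-02", "ABC", "DEF"),
        ("2021-02-02", "CBD", "FEA"),
        ("2021-02-03", "BED", "GER"),
        ("2021-02-04", "ART", "TOY"),
        ("2021-02-04", "ZSE", "KSL"),
    ],
)

numbered = conn.execute(
    """
    SELECT column1, column2, column3,
           DENSE_RANK() OVER (ORDER BY column1) AS same_per_date,
           ROW_NUMBER() OVER (ORDER BY column1, column2, column3) AS one_per_row
    FROM t
    ORDER BY column1, column2, column3
    """
).fetchall()

for row in numbered:
    print(row)
```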

How to scan DynamoDB table for retrieving only one item in each partition key

Let's say I have a table with partition key "ID" and range key "Time" with the following items:
ID | Time | Data
------------------
A | 1 | abc
A | 2 | def
B | 2 | ghi
B | 3 | jkl
I want to scan only one item per partition: the one with the highest Time value in that partition. So the outcome of the scan should look like:
ID | Time | Data
------------------
A | 2 | def
B | 3 | jkl
Is this possible with the DynamoDB's scan feature?
(I want to avoid scanning everything and doing the filtering myself.)
If you want to fetch just a few IDs along with their highest Time, you can run a Query per ID in descending sort-key order (ScanIndexForward set to false) with a Limit of 1, so that only 1 item is read for every ID. But for this you need an existing list of IDs.
So for each ID, there will be:
1 query
1 item read
Otherwise, the only way is to scan everything unfortunately.
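The per-ID query described above could look roughly like the sketch below. The request parameters are those of the DynamoDB Query operation as exposed by a boto3 low-level client; the table name "Events", the assumption that ID is a string attribute, and the FakeClient class (an in-memory stand-in so the example runs without AWS access) are all hypothetical.

```python
def latest_item(client, table_name, partition_id):
    """Return the item with the highest sort key for one partition key, or None."""
    response = client.query(
        TableName=table_name,
        KeyConditionExpression="#id = :id",
        ExpressionAttributeNames={"#id": "ID"},
        ExpressionAttributeValues={":id": {"S": partition_id}},
        ScanIndexForward=False,  # walk the sort key (Time) in descending order
        Limit=1,                 # stop after the first item, i.e. the highest Time
    )
    items = response.get("Items", [])
    return items[0] if items else None


class FakeClient:
    """Hypothetical stand-in that mimics only what latest_item() relies on."""

    def __init__(self, rows):
        self.rows = rows

    def query(self, **kwargs):
        wanted = kwargs["ExpressionAttributeValues"][":id"]["S"]
        matches = [r for r in self.rows if r["ID"]["S"] == wanted]
        descending = not kwargs.get("ScanIndexForward", True)
        matches.sort(key=lambda r: int(r["Time"]["N"]), reverse=descending)
        return {"Items": matches[: kwargs.get("Limit", len(matches))]}


sample = [
    {"ID": {"S": "A"}, "Time": {"N": "1"}, "Data": {"S": "abc"}},
    {"ID": {"S": "A"}, "Time": {"N": "2"}, "Data": {"S": "def"}},
    {"ID": {"S": "B"}, "Time": {"N": "2"}, "Data": {"S": "ghi"}},
    {"ID": {"S": "B"}, "Time": {"N": "3"}, "Data": {"S": "jkl"}},
]
client = FakeClient(sample)
latest = {pid: latest_item(client, "Events", pid) for pid in ["A", "B"]}
print({pid: item["Data"]["S"] for pid, item in latest.items()})
```

With a real table, client would be created with boto3.client("dynamodb") and the loop would run over the known list of IDs.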

T-SQL Server ORDER BY date and nulls last

I am studying for exam 70-761, and there is a challenge that asks to place NULLs at the end when using ORDER BY. I know the result is this one:
select
orderid,
shippeddate
from Sales.Orders
where custid = 20
order by case when shippeddate is null then 1 else 0 end, shippeddate
What I don't know is why the 1 and 0 are used and how they affect the result. Can anyone clarify?
Best Regards,
Daniel
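Your ORDER BY clause has two sort keys, so it effectively splits the rows into two groups and then sorts the rows inside each group.
First, the CASE expression returns 0 for rows that have a shippeddate and 1 for rows where it is NULL. Because 0 sorts before 1, all orders without a shippeddate are pushed to the end.
Then the rows are ordered by shippeddate within each group.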
Example:
orderID | shippeddate
| null
| today
| null
| yesterday
| tomorrow
First, sorting by case when shippeddate is null then 1 else 0 end, we get
orderID | shippeddate
| today
| yesterday
| tomorrow
| null
| null
Then, continuing the sort with shippeddate, we get
| yesterday
| today
| tomorrow
| null
| null
Hope this is useful to you.
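To see the two sort keys at work, the sketch below makes the CASE value visible as a column named null_flag (a hypothetical alias) and compares it with a plain sort. It runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server; both place NULLs first in an ascending sort by default. The table contents are made-up sample data, and orderid is added as a last sort key only to make ties deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (orderid INTEGER, shippeddate TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [(1, None), (2, "2016-05-02"), (3, None), (4, "2016-05-01"), (5, "2016-05-03")],
)

# Plain ascending sort: NULLs come first.
plain = conn.execute(
    "SELECT orderid, shippeddate FROM Orders ORDER BY shippeddate, orderid"
).fetchall()

# Sort on the 0/1 flag first, then on the date: NULL rows (flag 1) go last.
nulls_last = conn.execute(
    """
    SELECT orderid, shippeddate,
           CASE WHEN shippeddate IS NULL THEN 1 ELSE 0 END AS null_flag
    FROM Orders
    ORDER BY null_flag, shippeddate, orderid
    """
).fetchall()

print([r[0] for r in plain])       # [1, 3, 4, 2, 5]
print([r[0] for r in nulls_last])  # [4, 2, 5, 1, 3]
```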

Application Insights query to get time between 2 custom events

I am trying to write a query that will get me the average time between 2 custom events, sorted by user session. I have added custom tracking events throughout this application and I want to query the time it takes the user from 'Setup' event to 'Process' event.
let allEvents=customEvents
| where timestamp between (datetime(2019-09-25T15:57:18.327Z)..datetime(2019-09-25T16:57:18.327Z))
| extend SourceType = 5;
let allPageViews=pageViews
| take 0;
let all = allEvents
| union allPageViews;
let step1 = materialize(all
| where name == "Setup" and SourceType == 5
| summarize arg_min(timestamp, *) by user_Id
| project user_Id, step1_time = timestamp);
let step2 = materialize(step1
| join
hint.strategy=broadcast (all
| where name == "Process" and SourceType == 5
| project user_Id, step2_time=timestamp
)
on user_Id
| where step1_time < step2_time
| summarize arg_min(step2_time, *) by user_Id
| project user_Id, step1_time,step2_time);
let 1Id=step1_time;
let 2Id=step2_time;
1Id
| union 2Id
| summarize AverageTimeBetween=avg(step2_time - step1_time)
| project AverageTimeBetween
When I run this query it produces this error message:
'' operator: Failed to resolve table or column or scalar expression named 'step1_time'
I am relatively new to writing queries with AI and have not found many resources to assist with this problem. Thank you in advance for your help!
I'm not sure what the let 1Id=step1_time lines are intended to do.
Those lines try to declare new values, but step1_time isn't something a let can refer to on its own; it is a column in the result of another query.
I'm also not sure why you're doing pageViews | take 0 and unioning it with the events.
let allEvents=customEvents
| where timestamp between (datetime(2019-09-25T15:57:18.327Z)..datetime(2019-09-25T16:57:18.327Z))
| extend SourceType = 5;
let step1 = materialize(allEvents
| where name == "Setup" and SourceType == 5
| summarize arg_min(timestamp, *) by user_Id
| project user_Id, step1_time = timestamp);
let step2 = materialize(step1
| join
hint.strategy=broadcast (allEvents
| where name == "Process" and SourceType == 5
| project user_Id, step2_time=timestamp
)
on user_Id
| where step1_time < step2_time
| summarize arg_min(step2_time, *) by user_Id
| project user_Id, step1_time,step2_time);
step2
| summarize AverageTimeBetween=avg(step2_time - step1_time)
| project AverageTimeBetween
If I remove the things I don't understand (the union with 0 pageViews and the extra lets), I get a result. I don't have your data, so I had to use values other than "Setup" and "Process", and I can't tell whether it is what you expect.
You might want to look at the results of the step2 query without the summarize, to check that what you're getting matches what you expect.
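As a way to check expectations against a small data set, here is a plain-Python sketch of what the corrected query computes: the earliest "Setup" per user, the earliest "Process" after it, and the average gap. It is not Kusto and does not talk to Application Insights; the function name average_time_between and the sample events are hypothetical.

```python
from datetime import datetime, timedelta

# (user_Id, name, timestamp): made-up sample events
events = [
    ("u1", "Setup", datetime(2019, 9, 25, 16, 0, 0)),
    ("u1", "Process", datetime(2019, 9, 25, 16, 0, 30)),
    ("u1", "Process", datetime(2019, 9, 25, 16, 5, 0)),
    ("u2", "Process", datetime(2019, 9, 25, 16, 1, 0)),  # before Setup, ignored
    ("u2", "Setup", datetime(2019, 9, 25, 16, 2, 0)),
    ("u2", "Process", datetime(2019, 9, 25, 16, 3, 30)),
    ("u3", "Setup", datetime(2019, 9, 25, 16, 4, 0)),    # never reaches Process
]


def average_time_between(events, first="Setup", second="Process"):
    # step1: earliest `first` event per user (arg_min(timestamp, *) by user_Id)
    step1 = {}
    for user, name, ts in events:
        if name == first and (user not in step1 or ts < step1[user]):
            step1[user] = ts

    # step2: earliest `second` event that happens after the user's step1 time
    step2 = {}
    for user, name, ts in events:
        if name == second and user in step1 and ts > step1[user]:
            if user not in step2 or ts < step2[user]:
                step2[user] = ts

    gaps = [step2[user] - step1[user] for user in step2]
    return sum(gaps, timedelta()) / len(gaps) if gaps else None


average = average_time_between(events)
print(average)  # 0:01:00
```

Users who never reach the second event (u3 here) drop out, as they do in the inner join of the query.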

SQLite how to filter GROUPs with different criteria

I have a set of data that contains a set of names, publishers and dates.
I am trying to find cases where a name exists on the same date, but without duplicate publishers.
I am able to find names that exist on the same date with this query:
SELECT * FROM list GROUP BY datething HAVING COUNT(*) >= 2
however, I'm not sure how to show names that have a unique publisher within the one grouped date.
What comes to mind is using a subquery like:
SELECT * FROM list WHERE datething IN (
    SELECT datething FROM list GROUP BY datething HAVING COUNT(*) >= 2)
GROUP BY pub HAVING COUNT(*) == 1
but this has the effect of eliminating all publishers, even if they only had one entry for a day.
For example..
Name | pub | datething
Arr | Yoda | 2016-07-09
Foo | Akbar | 2016-07-10
Bar | Akbar | 2016-07-10
Baz | Leia | 2016-07-10
Far | Luke | 2016-07-10
Bar2 | Akbar | 2016-07-11
Baz2 | Leia | 2016-07-11
Foo2 | Leia | 2016-07-11
Far2 | Luke | 2016-07-11
For 2016-07-10, I expect to see Baz and Far, because Foo and Bar are by the same publisher.
For 2016-07-11, I expect to see Bar2 and Far2.
I don't expect to see anything on 2016-07-09, because there's only one entry there.
However, because of the outer GROUP BY clause, I get 0 results, since every publisher has more than one entry overall.
Any help is appreciated.
Thanks!
You need to group by datething and publisher for your second filter to work.
SELECT *
FROM list
WHERE datething IN (
          SELECT datething
          FROM list
          GROUP BY datething
          HAVING COUNT( * ) >= 2
      )
GROUP BY datething,
         pub
HAVING COUNT( * ) == 1;
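A quick check of this approach against the sample data from the question, using Python's sqlite3 module. The sketch keeps the question's threshold of at least two entries per date and adds an ORDER BY only so that the printed order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE list (Name TEXT, pub TEXT, datething TEXT)")
conn.executemany(
    "INSERT INTO list VALUES (?, ?, ?)",
    [
        ("Arr", "Yoda", "2016-07-09"),
        ("Foo", "Akbar", "2016-07-10"),
        ("Bar", "Akbar", "2016-07-10"),
        ("Baz", "Leia", "2016-07-10"),
        ("Far", "Luke", "2016-07-10"),
        ("Bar2", "Akbar", "2016-07-11"),
        ("Baz2", "Leia", "2016-07-11"),
        ("Foo2", "Leia", "2016-07-11"),
        ("Far2", "Luke", "2016-07-11"),
    ],
)

# Inner query: dates with at least two entries.
# Outer query: within those dates, keep (date, publisher) groups of exactly one row.
unique_pubs = conn.execute(
    """
    SELECT Name, pub, datething
    FROM list
    WHERE datething IN (
        SELECT datething FROM list GROUP BY datething HAVING COUNT(*) >= 2
    )
    GROUP BY datething, pub
    HAVING COUNT(*) = 1
    ORDER BY datething, Name
    """
).fetchall()

for row in unique_pubs:
    print(row)
```

Because every surviving group contains exactly one row, the non-aggregated Name column is unambiguous here.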
