How would I return a count of all rows in the table and then the count of each time a specific status is found? - sqlite

Please forgive my ignorance on sqlalchemy, up until this point I've been able to navigate the seas just fine. What I'm looking to do is this:
Return a count of how many items are in the table.
Return a count of many times different statuses appear in the table.
I'm currently using sqlalchemy, but even a pure sqlite solution would be beneficial in figuring out what I'm missing.
Here is how my table is configured:
class KbStatus(db.Model):
id = db.Column(db.Integer, primary_key=True)
status = db.Column(db.String, nullable=False)
It's a very basic table but I'm having a hard time getting back the data I'm looking for. I have this working with 2 separate queries, but I have to believe there is a way to do this all in one query.
Here are the separate queries I'm running:
total = len(cls.query.all())
status_count = cls.query.with_entities(KbStatus.status, func.count(KbStatus.id).label("total")).group_by(KbStatus.status).all()
From here I'm converting it to a dict and combining it to make the output look like so:
{
"data": {
"status_count": {
"Assigned": 1,
"In Progress": 1,
"Peer Review": 1,
"Ready to Publish": 1,
"Unassigned": 4
},
"total_requests": 8
}
}
Any help is greatly appreciated.

I don't know about sqlalchemy, but it's possible to generate the results you want in a single query with pure sqlite using the JSON1 extension:
Given the following table and data:
CREATE TABLE data(id INTEGER PRIMARY KEY, status TEXT);
INSERT INTO data(status) VALUES ('Assigned'),('In Progress'),('Peer Review'),('Ready to Publish')
,('Unassigned'),('Unassigned'),('Unassigned'),('Unassigned');
CREATE INDEX data_idx_status ON data(status);
this query
WITH individuals AS (SELECT status, count(status) AS total FROM data GROUP BY status)
SELECT json_object('data'
, json_object('status_count'
, json_group_object(status, total)
, 'total_requests'
, (SELECT sum(total) FROM individuals)))
FROM individuals;
will return one row holding (After running through a JSON pretty printer; the actual string is more compact):
{
"data": {
"status_count": {
"Assigned": 1,
"In Progress": 1,
"Peer Review": 1,
"Ready to Publish": 1,
"Unassigned": 4
},
"total_requests": 8
}
}
If the sqlite instance you're using wasn't built with support for JSON1:
SELECT status, count(status) AS total FROM data GROUP BY status;
will give
status total
-------------------- ----------
Assigned 1
In Progress 1
Peer Review 1
Ready to Publish 1
Unassigned 4
which you can iterate through in python, inserting each row into your dict and adding up all total values in another variable as you go to get the total_requests value at the end. No need for another query just to calculate that number; do it manually. I bet it's really easy to do the same thing with your existing second sqlachemy query.

Related

Top N per Classification in CosmosDB

I'm kinda stuck on this issue. I have several hundreds of a certain model stored in ComsosDb and I can't seem to get the top 5 of each category.
This is the model:
"id": "06224840-6b88-4394-9324-4d1628383702",
"name": "Reservation",
"description": null,
"client": null,
"reference": null,
"isMonitoring": false,
"monitoringSince": null,
"hasRiskProfile": false,
"riskProfile": -1,
"monitorFrequency": 0,
"mainBindable": null,
"organizationId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"userId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"createDate": "2020-08-18T11:00:02.5266403Z",
"updateDate": "2020-08-18T11:00:02.5266419Z",
"lastMonitorDate": "2020-08-18T11:00:02.5266427Z"
So what i'm trying to do is use C# to get the top 5 from each risk profile where the organizationId matches. GroupBy through LINQ throws an error, same with a row_number() query combined with a PARTITION BY, doesn't seem to work either.
Any way I can get this to work in a single query compatible with cosmos?
EDIT:
What i am trying to achieve in CosmosDb is this roughly:
WITH TopEntries AS (
SELECT *
,ROW_NUMBER() OVER (
PARTITION BY [riskProfile]
ORDER BY [updateDate] DESC
) AS [ROW NUMBER]
WHERE [organizationId] = "xyz"
FROM [reservations]
)
SELECT * FROM TopEntries
WHERE TopEntries.[ROW NUMBER] <= 5
It sounds like combining TOP and ORDER BY would do the job. For example:
SELECT TOP 5 *
FROM c
WHERE c.organizationId = "xyz"
ORDER BY c.riskProfile
You can build such queries with parameters in the .NET SDK as in this sample.
The functionality you are trying to achieve is not directly possible through single query in Cosmos DB. There are 2 steps to do this(You can change as per you document sets)
Firstly you will have to group by like below:
SELECT c.city FROM c where c.org = 'xyz' group by c.city
Then loop through the result one by one from the first query like below:
SELECT TOP 5 * FROM C WHERE C.city = 'delhi' order by C.date desc
You can refer to similar issue here:
https://learn.microsoft.com/en-us/answers/questions/38454/index.html

Query ADX table name dynamically

I have a need to be able to query Azure Data Explorer (ADX) tables dynamically, that is, using application-specific metadata that is also stored in ADX.
If this is even possible, the way to do it seems to be via the table() function. In other words, it feels like I should be able to simply write:
let table_name = <non-trivial ADX query that returns the name of a table as a string>;
table(table_name) | limit 10
But this query fails since I am trying to pass a variable to the table() function, and "a parameter, which is not scalar constant string can't be passed as parameter to table() function". The workaround provided doesn't really help, since all the possible table names are not known ahead of time.
Is there any way to do this all within ADX (i.e. without multiple queries from the client) or do I need to go back to the drawing board?
if you know the desired output schema, you could potentially achieve that using union (note that in this case, the result schema will be the union of all tables, and you'll need to explicitly project the columns you're interested in)
let TableA = view() { print col1 = "hello world"};
let TableB = view() { print col1 = "goodbye universe" };
let LabelTable = datatable(table_name:string, label:string, updated:datetime)
[
"TableA", "MyLabel", datetime(2019-10-08),
"TableB", "MyLabel", datetime(2019-10-02)
];
let GetLabeledTable = (l:string)
{
toscalar(
LabelTable
| where label == l
| order by updated desc
| limit 1
)
};
let table_name = GetLabeledTable('MyLabel');
union withsource = T *
| where T == table_name
| project col1

What's the equivalent DynamoDB solution for this MySQL Query?

I'm familiar with MySQL and am starting to use Amazon DynamoDB for a new project.
Assume I have a MySQL table like this:
CREATE TABLE foo (
id CHAR(64) NOT NULL,
scheduledDelivery DATETIME NOT NULL,
-- ...other columns...
PRIMARY KEY(id),
INDEX schedIndex (scheduledDelivery)
);
Note the secondary Index schedIndex which is supposed to speed-up the following query (which is executed periodically):
SELECT *
FROM foo
WHERE scheduledDelivery <= NOW()
ORDER BY scheduledDelivery ASC
LIMIT 100;
That is: Take the 100 oldest items that are due to be delivered.
With DynamoDB I can use the id column as primary partition key.
However, I don't understand how I can avoid full-table scans in DynamoDB. When adding a secondary index I must always specify a "partition key". However, (in MySQL words) I see these problems:
the scheduledDelivery column is not unique, so it can't be used as a partition key itself AFAIK
adding id as unique partition key and using scheduledDelivery as "sort key" sounds like a (id, scheduledDelivery) secondary index to me, which makes that index pratically useless
I understand that MySQL and DynamoDB require different approaches, so what would be a appropriate solution in this case?
It's not possible to avoid a full table scan with this kind of query.
However, you may be able to disguise it as a Query operation, which would allow you to sort the results (not possible with a Scan).
You must first create a GSI. Let's name it scheduled_delivery-index.
We will specify our index's partition key to be an attribute named fixed_val, and our sort key to be scheduled_delivery.
fixed_val will contain any value you want, but it must always be that value, and you must know it from the client side. For the sake of this example, let's say that fixed_val will always be 1.
GSI keys do not have to be unique, so don't worry if there are two duplicated scheduled_delivery values.
You would query the table like this:
var now = Date.now();
//...
{
TableName: "foo",
IndexName: "scheduled_delivery-index",
ExpressionAttributeNames: {
"#f": "fixed_value",
"#d": "scheduled_delivery"
},
ExpressionAttributeValues: {
":f": 1,
":d": now
},
KeyConditionExpression: "#f = :f and #d <= :d",
ScanIndexForward: true
}

Ora-00904 - Error with creating a View

I'm stuck with creating a view in Oracle, but before I create the view, I always test it first and I always got this error: Ora-00904.
This is the situation. I have this one Set of Query let say Query A that I need to combined using UNION ALL with the Query A itself with only few modifications applied to create another bigger Set of Query - Query B. The main constraint that keeps me on doing this is the Database Design, and I'm not in the position in the company to change it, so I have to adapt to it. Query A unions Query A for 6 times creating Query B. The additional Major constraint is Query B is from 1 database user only, but there are 54 database users with the same structures that I need to fetch the same query. Query B (db user1) unions Query B (db user2) unions Query B (db user3) and so on until 54 then finally creating Query C --- the final output. My scrip has already reached 6048 lines, then I got this problem that I don't get when I test Query A and Query B. All my table names, owner names, and column names are all correct but I got that error.
This is the code (that needs to be repeated for 54x6 times) - the Query A. Query B applies some similar modification only.:
Select
'2013' "YEAR",
Upper(a.text_month) "MONTH",
Upper('Budget') "VERSION",
case
when length(b.level1_name) > 5 then 'Parent'
else 'SUBSIDIARIES'
end "COMPANY_GROUP",
case
when length(b.level1_name) < 6 and b.level1_name <> '1000' then 'Subsidiaries'
else '1000 Parent'
end "COMPANY",
case
when length(b.level1_name) < 6 and b.level1_name <> '1000' then 'SUBS'
else '1000'
end "COMPANY_CODE",
case
when length(b.level1_name) > 5 then 'Parent'
else 'SUBSIDIARIES'
end "COMPANY_NAME",
b.level1_displayname "DIVISION",
b.level1_name "DIVISION_CODE",
case
when length(b.level1_name) > 5 then ltrim(upper(substr(b.level1_displayname, 8)))
else upper(ltrim(substr(b.level1_displayname, 10)))
end "DIVISION_NAME",
upper(a.text_nature_of_trip) "NATURE_OF_TRAVEL",
upper(a.text_placeeventstraining) "TRAVEL_DETAILS",
upper(a.text_country) "COUNTRY",
a.text_name_of_employee "EMPLOYEE_NAME", a.float_no_of_attendees "NO_OF_ATTENDEES",
a.text_sponsored "SPONSORED",
a.text_remarks "REMARKS",
'OTHER TRAVEL EXPENSES' "COST_ELEMENT",
a.FLOAT_702120005_OTHER_TRAVEL_E "AMOUNT"
From PUBLISH_PNL_AAAA_2013.et_travel_transaction a,
PUBLISH_PNL_AAAA_2013.cy_2elist b
Where a.elist = b.level3_iid
ORA-00904 is "invalid column name" -- either you've spelled the column name wrongly, or prefixed it with the wrong table alias, omitted quotes from a string literal, or any number of other issues.
Check the point in the code that the error message mentions for mistakes like that.

SQLite: possible to update row or insert if it doesn't exist?

I'm sure I can check if a row exists by selecting it but I'm wondering if there's a slicker way that I'm just not aware of -- seems like a common enough task that there might be. This SQLite table looks something like this:
rowID QID ANID value
------ ------ ----- ------
0 axo 1 45
1 axo 2 12
If the combination of QID and ANID already exists, that value should be updated, if the combination of QID and ANID doesn't already exist then it should be inserted. While its simple enough to write:
SELECT * where QID = 'axo' and ANID = 3;
And check if the row exists then branch and either insert/update I can't help but look for a better way. Thanks in advance!
Beware: REPLACE doesn't really equal 'UPDATE OR INSERT'...REPLACE replaces the entire row. Therefore if you don't specify values for EVERY column, you'll replace the un-specified columns with NULL or the default values.
In a simple example such as above, it's likely fine, but if you get into the habit of using REPLACE as 'UPDATE OR INSERT' you'll nuke data when you forget to specify a value for every field...just a warning from experience.
The insert documentation details a REPLACE option
INSERT OR REPLACE INTO tabname (QID,ANID,value) VALUES ('axo',3,45)
The "INSERT OR REPLACE" can abbreviated to REPLACE.
Example in Perl
Revised the following example to utilize a composite primary key
use strict;
use DBI;
### Connect to the database via DBI
my $dbfile = "simple.db";
unlink $dbfile;
my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile");
### Create a table
$dbh->do("CREATE TABLE tblData (qid TEXT, anid INTEGER, value INTEGER, PRIMARY KEY(qid, anid))");
### Add some data
my $insert = $dbh->prepare("INSERT INTO tblData (qid,anid,value) VALUES (?,?,?)");
$insert->execute('axo', 1, 45);
$insert->execute('axo', 2, 12);
$insert->finish;
### Update data
my $insert_update = $dbh->prepare("REPLACE INTO tblData (qid,anid,value) VALUES (?,?,?)");
$insert_update->execute('axo', 2, 500);
$insert_update->execute('axo', 10, 500);
$insert_update->finish;
### Print out the data
my $select = $dbh->prepare("SELECT * FROM tblData ORDER BY 1,2");
$select->execute;
while (my #row = $select->fetchrow_array()) {
printf "Row: %s\n", join(" - ", #row);
}
$select->finish;
$dbh->disconnect;
exit 0;
Produces the following output, demonstrating the update of one row and the insert of another
Row: axo - 1 - 45
Row: axo - 2 - 500
Row: axo - 10 - 500
you can use the REPLACE command. Docs
In this particular situation it turns out that there's no easy solution - at least not that I've been able to find, despite the efforts of Mark to suggest one.
The solution I've ended up using might provide useful to someone else so I'm posting it here.
I've defined a multi-column primary key as QID + ANID and attempt to insert into the table. If that particular combination exists, it will produce an error code of 23000 which I can then use an an indicator that it should be an UPDATE instead. Here's the basic gist (in PHP):
try {
$DB->exec("INSERT INTO tblData (QID,ANID,value) VALUES ('xxx','x',1)");
}
catch(PDOException $e) {
if($e->getCode() == 23000) {
$DB->exec("UPDATE tblData SET value=value+1 WHERE ANID='x' AND QID='xxx'");
}
}
The actual code I've used is a bit more complex using prepare/placeholders and handling the error if the code isn't 23000, etc.

Resources