QSqlQuery GroupBy Slow - qt

I have a problem with QSqlQuery and GROUP BY.
My database is in-memory and is created like this:
query.exec("create table if not exists ProzSchnitte (id integer primary key autoincrement, Kanal uint, ProgNr uint, WkzNr uint, WkzBez blob, BearbNr uint, "
"Offset uint, Datum uint, Uhrzeit uint, count uint, SchnittHeader blob, xVals blob, yVals blob, Strom1 blob, Strom2 blob, Strom3 blob)");
I then write my data into this table like this:
query.prepare(QString("INSERT INTO ProzSchnitte (Kanal, ProgNr, WkzNr, WkzBez, BearbNr, Offset, Datum, Uhrzeit, Count, SchnittHeader, xVals, yVals, Strom1, Strom2, Strom3) VALUES (%1, %2, %3, ?, %4, %5, %6,%7, %8, ?, ?, ?, ?, ?, ?) ")
.arg(TempSchnitt.Header.Kanal).arg(TempSchnitt.Header.Programmnummer).arg(TempSchnitt.Header.Werkzeugnummer)
.arg(TempSchnitt.Header.Bearbeitungsnummer).arg(TempSchnitt.Header.Offset).arg(TempSchnitt.Header.Datum)
.arg(TempSchnitt.Header.Uhrzeit).arg(impCount));
query.bindValue(1,HeaderArray);
query.bindValue(2,x);
query.bindValue(3,y);
query.bindValue(4,Strom1);
query.bindValue(5,Strom2);
query.bindValue(6,Strom3);
query.exec();
query.finish();
After that I read the data like this:
QSqlQuery q(QSqlDatabase::database("ProzAnaDB"));
q.prepare("select * from ProzSchnitte GROUP BY Kanal, ProgNr, Offset, WkzNr, BearbNr, Count order by Count, Kanal, ProgNr, Offset, BearbNr");
q.exec();
Executing this is extremely slow and takes about 8 seconds, but if I run the select without GROUP BY it executes in under 10 milliseconds.
I have read that I may have to set some indexes on my table, but I have no clue what that means or how I could do that.
I ran an EXPLAIN of my query and this was the result:
QSqlRecord( 8 )
" 0:" QSqlField("addr", int, generated: yes, typeID: 1) "0"
" 1:" QSqlField("opcode", QString, generated: yes, typeID: 3) ""
" 2:" QSqlField("p1", int, generated: yes, typeID: 1) "0"
" 3:" QSqlField("p2", int, generated: yes, typeID: 1) "0"
" 4:" QSqlField("p3", int, generated: yes, typeID: 1) "0"
" 5:" QSqlField("p4", QString, generated: yes, typeID: 3) ""
" 6:" QSqlField("p5", QString, generated: yes, typeID: 3) ""
" 7:" QSqlField("comment", , generated: yes, typeID: 5) ""
Can somebody explain what I could do to boost the performance of this query?

The correct command to analyze a query is not EXPLAIN but EXPLAIN QUERY PLAN, and you should do this in the command-line shell or another database tool so that you can view the result more easily.
Anyway, the most useful index for this query is one that matches both the ORDER BY and the GROUP BY (so reorder the GROUP BY columns):
CREATE INDEX whatever ON ProzSchnitte(Count, Kanal, ProgNr, Offset, BearbNr);
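
To see what the index changes, the plan can be compared before and after creating it. The following is a minimal sketch using Python's sqlite3 module instead of Qt, with a reduced table (only the grouped columns), made-up data, and a hypothetical helper `query_plan`; the exact plan text depends on the SQLite version. From Qt, the same CREATE INDEX statement can be passed to `QSqlQuery::exec` just like the CREATE TABLE statement in the question.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")

# Reduced table: only the grouped columns from the question, no blobs.
# "Offset" is quoted because OFFSET is also an SQL keyword.
conn.execute(
    'CREATE TABLE ProzSchnitte (id INTEGER PRIMARY KEY AUTOINCREMENT, '
    'Kanal INT, ProgNr INT, WkzNr INT, BearbNr INT, "Offset" INT, Count INT)'
)

# Made-up data: 144 distinct combinations, each inserted twice.
combos = list(product(range(2), range(3), range(2), range(2), range(2), range(3)))
conn.executemany(
    'INSERT INTO ProzSchnitte (Kanal, ProgNr, WkzNr, BearbNr, "Offset", Count) '
    'VALUES (?, ?, ?, ?, ?, ?)',
    combos * 2,
)

# GROUP BY reordered so that it starts with the ORDER BY columns.
QUERY = (
    'SELECT Count, Kanal, ProgNr, "Offset", BearbNr, WkzNr, count(*) AS n '
    'FROM ProzSchnitte '
    'GROUP BY Count, Kanal, ProgNr, "Offset", BearbNr, WkzNr '
    'ORDER BY Count, Kanal, ProgNr, "Offset", BearbNr'
)


def query_plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = query_plan(QUERY)
rows_before = conn.execute(QUERY).fetchall()

# The index suggested in the answer.
conn.execute(
    'CREATE INDEX whatever ON ProzSchnitte(Count, Kanal, ProgNr, "Offset", BearbNr)'
)

plan_after = query_plan(QUERY)
rows_after = conn.execute(QUERY).fetchall()

print("without index:", plan_before)
print("with index:   ", plan_after)
```

Lines in the plan that mention a temporary B-tree indicate sorting work that a matching index can avoid.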

Snowflake ODBC and VARIANT data type

Using odbctest and the Snowflake 64-bit ODBC driver for Windows:
I created a table in Snowflake using this DDL:
CREATE TABLE "SFDEST"."QAUSER"."BT14726"
("VARCHAR_10_COL" VARCHAR (10),
"VARCHAR_4000_COL" VARCHAR (4000) ,
"CHAR_10_COL" CHAR (10) ,
"CLOB_COL" VARIANT,
"ROWID" CHAR (18) NOT NULL )
Then I attempted to prepare this insert statement:
INSERT INTO "SFDEST"."QAUSER"."BT14726"
("VARCHAR_10_COL",
"VARCHAR_4000_COL",
"CHAR_10_COL",
"ROWID",
"CLOB_COL")
VALUES ( ?, ?, ?, ?, ?)
But this error was returned:
Prepare of destination insert statement failed. SQL compilation error:
Expression type does not match column data type, expecting VARIANT but
got VARCHAR(1) for column CLOB_COL
This is the relevant portion of odbc trace:
sqdrsvc 3dfc-52bc ENTER SQLPrepare
HSTMT 0x000000435C961620
UCHAR * 0x000000435D262720 [ 140] "INSERT INTO "SFDEST"."QAUSER"."BT14726" ("VARCHAR_10_COL",
"VARCHAR_4000_COL", "CHAR_10_COL", "ROWID", "CLOB_COL") VALUES ( ?,
?, ?, ?, ?) "
SDWORD 140
sqdrsvc 3dfc-52bc EXIT SQLPrepare with return code
-1 (SQL_ERROR)
HSTMT 0x000000435C961620
UCHAR * 0x000000435D262720 [ 140] "INSERT INTO "SFDEST"."QAUSER"."BT14726" ("VARCHAR_10_COL",
"VARCHAR_4000_COL", "CHAR_10_COL", "ROWID", "CLOB_COL") VALUES ( ?,
?, ?, ?, ?) "
SDWORD 140
DIAG [22000] SQL compilation error: Expression type does not match column data type, expecting VARIANT but got
VARCHAR(1) for column CLOB_COL (2023)
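If you have a string that is formatted as valid JSON, you need to use PARSE_JSON to convert it into an actual VARIANT value so that Snowflake can recognize it as such.
Probably something like this (if Snowflake rejects the function call inside the VALUES clause, try the INSERT ... SELECT ?, ?, ?, ?, PARSE_JSON(?) form instead):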
INSERT INTO "SFDEST"."QAUSER"."BT14726"
("VARCHAR_10_COL",
"VARCHAR_4000_COL",
"CHAR_10_COL",
"ROWID",
"CLOB_COL")
VALUES ( ?, ?, ?, ?, PARSE_JSON(?))

SQLITE - Sorting

I have this DB schema with two tables, one for Athletes and one for Results.
I'm trying to get the last (i.e. greatest) elapsed time of each athlete using this query:
Select Query
Select Athletes.BibNumber, Athletes.ChipNumber, Athletes.FirstName, Athletes.LastName, Athletes.Sex, Athletes.Category, count(Results.ElapsedTime) as Lapcount, Results.ElapsedTime
From Results, Athletes
Where Results.ChipNumber = Athletes.ChipNumber and Athletes.Category = 'A (Elite)' and Athletes.Sex = 'M' and Results.Active = 1
Group by Athletes.ChipNumber
Order by (Athletes.Sex = 'M') DESC, Athletes.Sex, Athletes.Category, Lapcount DESC, Results.ElapsedTime ASC;
This works as long as the times are inserted in increasing order, but if I add or change a time afterwards, so that a record with a larger ID holds a smaller time, the wrong elapsed time is returned and the sort order is not applied.
Running the above query the result is:
"1" "2018001" "User" "2" "M" "A (Elite)" "5" "00:00:00.000"
"2" "2018002" "User" "1" "M" "A (Elite)" "5" "01:18:09.923"
But I would like to have:
"1" "2018001" "User" "2" "M" "A (Elite)" "5" "01:11:51.384"
"2" "2018002" "User" "1" "M" "A (Elite)" "5" "01:18:09.923"
DB Schema
CREATE TABLE IF NOT EXISTS `Results` (
`ID` INTEGER PRIMARY KEY AUTOINCREMENT,
`ChipNumber` TEXT,
`ReaderTime` TEXT,
`Antenna` TEXT,
`ElapsedTime` TEXT,
`Active` INTEGER DEFAULT 0
);
INSERT INTO `Results` (ID,ChipNumber,ReaderTime,Antenna,ElapsedTime,Active) VALUES
(72354,'2018002','2018/07/29 12:01:39.000','Gun','00:00:00.000',1),
(72383,'2018001','2018/07/29 12:19:07.975','S3','00:17:28.974',1),
(72386,'2018002','2018/07/29 12:19:51.877','S3','00:18:12.876',1),
(72411,'2018001','2018/07/29 12:36:49.677','S3','00:35:10.676',1),
(72415,'2018002','2018/07/29 12:39:29.232','S3','00:37:50.231',1),
(72433,'2018001','2018/07/29 12:55:08.811','S3','00:53:29.810',1),
(72439,'2018002','2018/07/29 12:59:37.760','M3','00:57:58.759',1),
(72452,'2018001','2018/07/29 13:13:30.385','S3','01:11:51.384',1),
(72456,'2018002','2018/07/29 13:19:48.923','Manual','01:18:09.923',1),
(72465,'2018001','2018/07/29 12:01:39.000','Gun','00:00:00.000',1);
CREATE TABLE IF NOT EXISTS `Athletes` (
`ID` INTEGER PRIMARY KEY AUTOINCREMENT,
`FirstName` TEXT,
`LastName` TEXT,
`Sex` TEXT DEFAULT 'M',
`Category` TEXT DEFAULT NULL,
`BibNumber` INTEGER DEFAULT 0,
`ChipNumber` TEXT DEFAULT 0,
`Active` BOOLEAN DEFAULT 0
);
INSERT INTO `Athletes` (ID,FirstName,LastName,Sex,Category,BibNumber,ChipNumber,Active) VALUES
(3,'User','1','M','A (Elite)',2,'2018002',1),
(29,'User','2','M','A (Elite)',1,'2018001',1);
I believe that your issue is due to the following:
If the SELECT statement is an aggregate query without a GROUP BY
clause, then each aggregate expression in the result-set is evaluated
once across the entire dataset. Each non-aggregate expression in the
result-set is evaluated once for an arbitrarily selected row of the
dataset. The same arbitrarily selected row is used for each
non-aggregate expression. Or, if the dataset contains zero rows, then
each non-aggregate expression is evaluated against a row consisting
entirely of NULL values.
SQL As Understood By SQLite - SELECT - 3. Generation of the set of result rows.
The same applies per group when a GROUP BY clause is present: Results.ElapsedTime is a non-aggregate expression, so its value comes from an arbitrarily selected row of each group.
To make sure you get the maximum elapsed time, you should use an aggregate function, max in your case, and sort on that aggregate as well.
Therefore, I believe the following will work for you (the filters are taken from your original query):
SELECT Athletes.BibNumber, Athletes.ChipNumber, Athletes.FirstName, Athletes.LastName, Athletes.Sex, Athletes.Category,
count(Results.ElapsedTime) AS Lapcount,
max(Results.ElapsedTime) AS ElapsedTime
FROM Results JOIN Athletes ON Results.ChipNumber = Athletes.ChipNumber
WHERE Athletes.Category = 'A (Elite)' AND Athletes.Sex = 'M' AND Results.Active = 1
GROUP BY Athletes.ChipNumber
ORDER BY (Athletes.Sex = 'M') DESC, Athletes.Sex, Athletes.Category, Lapcount DESC, max(Results.ElapsedTime) ASC;
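
The effect of max() can be checked against the sample data from the question. This is a minimal sketch using Python's sqlite3 module; the tables are reduced to the columns the query needs, and the alias `LastElapsed` is a name chosen here for illustration. Taking max() of a TEXT column works in this case because the fixed-width, zero-padded HH:MM:SS.mmm strings sort in time order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Results (
    ID INTEGER PRIMARY KEY AUTOINCREMENT,
    ChipNumber TEXT,
    ElapsedTime TEXT,
    Active INTEGER DEFAULT 0
);
CREATE TABLE Athletes (
    ID INTEGER PRIMARY KEY AUTOINCREMENT,
    FirstName TEXT, LastName TEXT, Sex TEXT, Category TEXT,
    BibNumber INTEGER, ChipNumber TEXT
);
INSERT INTO Results (ID, ChipNumber, ElapsedTime, Active) VALUES
    (72354, '2018002', '00:00:00.000', 1),
    (72383, '2018001', '00:17:28.974', 1),
    (72386, '2018002', '00:18:12.876', 1),
    (72411, '2018001', '00:35:10.676', 1),
    (72415, '2018002', '00:37:50.231', 1),
    (72433, '2018001', '00:53:29.810', 1),
    (72439, '2018002', '00:57:58.759', 1),
    (72452, '2018001', '01:11:51.384', 1),
    (72456, '2018002', '01:18:09.923', 1),
    -- The late insert with the largest ID but the smallest time:
    (72465, '2018001', '00:00:00.000', 1);
INSERT INTO Athletes (ID, FirstName, LastName, Sex, Category, BibNumber, ChipNumber) VALUES
    (3, 'User', '1', 'M', 'A (Elite)', 2, '2018002'),
    (29, 'User', '2', 'M', 'A (Elite)', 1, '2018001');
""")

# max() picks the greatest elapsed time per athlete regardless of row order.
rows = conn.execute("""
    SELECT a.BibNumber, a.ChipNumber, a.FirstName, a.LastName, a.Sex, a.Category,
           count(r.ElapsedTime) AS Lapcount,
           max(r.ElapsedTime)   AS LastElapsed
    FROM Results r JOIN Athletes a ON r.ChipNumber = a.ChipNumber
    WHERE r.Active = 1
    GROUP BY a.ChipNumber
    ORDER BY Lapcount DESC, LastElapsed ASC
""").fetchall()

for row in rows:
    print(row)
```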

Conversion failed when converting the varchar value '/DirectoryName/SubDirectoryName' to data type int

I have a stored procedure that I am calling from an ASP.NET page to get a list of files and some related fields. It fails when the dataset is filled, and I am getting an error saying the conversion failed when converting the varchar value to data type int. The field it is referring to is a varchar with relative directory paths. I don't really have a clue why this is happening. An explanation and a solution for this problem would be greatly appreciated. My stored procedure is below.
ALTER proc [dbo].[spFileDownload]
(
    @Folder varchar(1000) = null,
    @Keyword varchar(1000) = null,
    @BatchID IntegerListTable readonly,
    @OrgID IntegerListTable readonly
)
as
if @Keyword is not null
begin
    Select fldFileName,
        fldRelativePathName,
        fldDescription,
        fldDateAdded,
        fldKeywords,
        fldBatchDescription
    from tblReport
    inner join tblBatchLog
        on tblBatchLog.fldBatchID = tblReport.fldBatchID
    where fldRelativePathName = ISNULL(@Folder, fldRelativePathName)
        and Freetext(fldKeywords, @Keyword)
        and tblreport.fldBatchID in (Select n from @BatchID)
        and fldMembershipID in (select n from @OrgID)
end
else
begin
    select fldFileName,
        fldRelativePathName,
        fldDescription,
        fldDateAdded,
        fldKeywords,
        fldBatchDescription
    from tblReport
    inner join tblBatchLog
        on tblBatchLog.fldBatchID = tblReport.fldBatchID
    where fldRelativePathName = ISNULL(@Folder, fldRelativePathName)
        and fldRelativePathName in (select n from @BatchID)
        and fldMembershipID in (select n from @OrgID)
end
Edit: I failed at reading. Sorry.
In "and fldRelativePathName in (select n from @BatchID)", aren't you comparing the string fldRelativePathName to ints?
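Look at this line...
and fldRelativePathName in (select n from @BatchID)
Column "n" is an integer, so SQL Server tries to coerce fldRelativePathName to int, which fails for a value like '/DirectoryName/SubDirectoryName'. The first branch compares tblReport.fldBatchID against @BatchID, which is probably what was intended here as well.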

First time SQLite3 user python 3, syntax error in print

I'm working on a program that makes quizzes by storing questions in a database. From what I found online, one of the easiest ways to read from or write to a database in Python is the sqlite3 module, so I'm trying it out. This is the first time I've used sqlite3 with Python, and I keep getting a syntax error on the self.connection.commit() in:
def AddQuestion(self, Question, Answer1, Answer2, Answer3, Answer4):
    self.cursor.execute("""INSERT INTO questions
    VALUES (?, ?, ?, ?, ?, ?)""", (None, Question, Answer1, Answer2, Answer3, Answer4, CorrectAnswer)
    self.connection.commit()
If I were to turn it into a comment by adding # before it, it would tell me that the print in this was a syntax error:
print ("Would you like to make a test? Or would you like to take a test?")
Maybe it's my indentation, or am I doing something wrong?
import squlite3
class QuestionStorage(object):
    def _init_(self, path):
        self.connection = sqlite3.connect(path)
        self.cursor = self.connection.cursor ()
    def Close(self):
        self.cursor.close()
        self.connection.close()
    def CreateDb(self):
        query = """CREATE TABLE questions
        (id INTEGER PRIMARY KEY, Question TEXT, Answer1 TEXT, Answer2 TEXT, Answer3 TEXT, Answer4 TEXT, CorrectAnswer TEXT)"""
        self.cursor.exeute(query)
        self.connection.commit()
        #self.cursor.close()
    def AddQuestion(self, Question, Answer1, Answer2, Answer3, Answer4):
        self.cursor.execute("""INSERT INTO questions
        VALUES (?, ?, ?, ?, ?, ?)""", (None, Question, Answer1, Answer2, Answer3, Answer4, CorrectAnswer)
        self.connection.commit()
    def GetQuestion(self, index = None):
        self.cursor.execute("""SELECT * FROM questions WEHRE id=?""", (index,))
print ("Would you like to make a test? Or would you like to take a test?")
testTaker = input ("To create a test, type Create. To take a test, type Take.")
if testTaker == "Create":
    testName = input ("Give your test a name.")
    testQ = int(input ("How many questions will be on this test? (Numeric value only.)"))
    testType = input ("Will this test be multiple choice? (y/n)")
    if testType == "N" or "n":
        counter = 1
        qs = QuestionStorage("questions.db")
        qs.CreateDb()
        counter = 1
        while counter >= testQ:
            Answer = []
            Question = input ("What is your question?")
            Answer[1] = input ("What is the first answer?")
            Answer[2] = input ("What is the second answer?")
            Answer[3] = input ("What is the third answer?")
            Answer[4] = input ("What is your last answer?")
            correctAnswer = input("Which answer is the correct answer? (1, 2, 3, or 4?)")
            Answer[5] = Answer[correctAnswer]
            qs.AddQuestion(Question, Answer[1] , Answer[2], Answer[3], Answer[4], Answer[5])
            counter +=1
else:
and then after the else, I'd have the code for reading the database to take a test.
If anyone can help me out with this, that would be great. Right now I'm just trying to get it to the point where I can run it in debug.
You forgot to close the parentheses here:
self.cursor.execute("""INSERT INTO questions
VALUES (?, ?, ?, ?, ?, ?)""", (None, Question, Answer1, Answer2, Answer3, Answer4, CorrectAnswer)
Put another closing ) at the end of the line.
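
Once the parenthesis is closed, the posted class would still fail on a few other points: the module is named sqlite3 (not squlite3), the constructor must be `__init__` with double underscores, `exeute` and `WEHRE` are misspelled, the INSERT has six placeholders for seven values, and `AddQuestion` never receives `CorrectAnswer`. A minimal sketch of the class with those points addressed, keeping the original method names, might look like this (the `IF NOT EXISTS` clause, the return values, and the sample question are additions made here for illustration):

```python
import sqlite3


class QuestionStorage:
    def __init__(self, path):
        self.connection = sqlite3.connect(path)
        self.cursor = self.connection.cursor()

    def Close(self):
        self.cursor.close()
        self.connection.close()

    def CreateDb(self):
        # IF NOT EXISTS lets the script run more than once on the same file.
        self.cursor.execute(
            """CREATE TABLE IF NOT EXISTS questions
               (id INTEGER PRIMARY KEY, Question TEXT, Answer1 TEXT,
                Answer2 TEXT, Answer3 TEXT, Answer4 TEXT, CorrectAnswer TEXT)"""
        )
        self.connection.commit()

    def AddQuestion(self, Question, Answer1, Answer2, Answer3, Answer4,
                    CorrectAnswer):
        # Seven columns, so seven placeholders; None lets SQLite assign the id.
        self.cursor.execute(
            "INSERT INTO questions VALUES (?, ?, ?, ?, ?, ?, ?)",
            (None, Question, Answer1, Answer2, Answer3, Answer4, CorrectAnswer),
        )
        self.connection.commit()
        return self.cursor.lastrowid

    def GetQuestion(self, index):
        self.cursor.execute("SELECT * FROM questions WHERE id = ?", (index,))
        return self.cursor.fetchone()


# Usage with an in-memory database and a made-up question.
qs = QuestionStorage(":memory:")
qs.CreateDb()
qid = qs.AddQuestion("2 + 2 = ?", "3", "4", "5", "6", "4")
row = qs.GetQuestion(qid)
print(row)
qs.Close()
```

In the script part of the question, `testType == "N" or "n"` is always true (compare with `testType in ("N", "n")`), `while counter >= testQ` should be `while counter <= testQ`, and assigning to `Answer[1]` on an empty list raises IndexError, so the answers need to be appended to the list instead.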

How To Create A Stored Procedure That return a Random Number If Not Exist In The Table

I want to create a stored procedure that returns a random number between 11111 and 99999,
provided that the number does not already exist in the table.
I use this complicated function to do that, but I need to convert it to a stored procedure:
Function GiveRandomStudentNumber() As String
s:
Dim rnd As New Random
Dim st_num As String = rnd.Next(11111, 99999)
Dim cmd As New SqlCommand("select count(0) from student where st_num = " & st_num,con)
dd.con.Open()
Dim count As Integer = cmd.ExecuteScalar()
dd.con.Close()
If count <> 0 Then
GoTo s
Else
Return st_num
End If
End Function
This function works, but I need to convert it to a stored procedure.
Thanks in advance.
CREATE PROCEDURE [dbo].[Select_RandomNumber]
(
    @Lower INT, --11111-- The lowest random number
    @Upper INT --99999-- The highest random number
)
AS
BEGIN
    IF NOT (@Lower < @Upper) RETURN -1
    --TODO: If all the numbers between Lower and Upper are in the table,
    --you should return from here
    --RETURN -2
    DECLARE @Random INT;
    SELECT @Random = ROUND(((@Upper - @Lower -1) * RAND() + @Lower), 0)
    WHILE EXISTS (SELECT * FROM YourTable WHERE randCol = @Random)
    BEGIN
        SELECT @Random = ROUND(((@Upper - @Lower -1) * RAND() + @Lower), 0)
    END
    SELECT @Random
END
Create a table of student IDs. Fill it up with IDs between X and Y. Every time you want to use an ID, remove it from the table.
create table [FreeIDs] (
[ID] int,
[order] uniqueidentifier not null default newid() primary key);
insert into [FreeIDs] ([ID]) values (11111),(11112),...,(99999);
To get a free ID:
with cte as (
select top(1) [ID]
from [FreeIDs]
order by [order])
delete cte
output deleted.ID;
The persisted, predetermined order speeds up generating new IDs.
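
The same free-ID pool idea can be tried out with a minimal sketch in Python's sqlite3 module rather than SQL Server. It differs from the T-SQL above in ways that are assumptions of this sketch: a shuffled integer position stands in for newid(), the ID is selected and deleted inside one transaction instead of with DELETE ... OUTPUT, and the function names `create_pool` and `take_free_id` are made up for illustration.

```python
import random
import sqlite3


def create_pool(conn, lower, upper):
    """Fill FreeIDs with every ID in [lower, upper] in a random, persisted order."""
    conn.execute(
        "CREATE TABLE FreeIDs (ord INTEGER PRIMARY KEY, id INTEGER NOT NULL)"
    )
    ids = list(range(lower, upper + 1))
    random.shuffle(ids)  # the randomness is paid for once, up front
    with conn:
        conn.executemany(
            "INSERT INTO FreeIDs (ord, id) VALUES (?, ?)", enumerate(ids)
        )


def take_free_id(conn):
    """Remove and return the next free ID, or None when the pool is empty."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT ord, id FROM FreeIDs ORDER BY ord LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM FreeIDs WHERE ord = ?", (row[0],))
        return row[1]


# Usage with a small made-up range so the pool is quick to exhaust.
conn = sqlite3.connect(":memory:")
create_pool(conn, 11111, 11120)
taken = [take_free_id(conn) for _ in range(10)]
print(taken)
print(take_free_id(conn))  # pool is empty now
```

Each call costs one indexed lookup and one delete, no matter how many IDs have already been handed out.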
BTW, if you're tempted to 'optimize' the table and go by a numbers table:
with Digits as (
select Digit
from (
values (0), (1), (2), (3), (4), (5),
(6), (7), (8), (9)) as t(Digit)),
Numbers as (
select u.Digit + t.Digit*10 +h.Digit*100 + m.Digit*1000+tm.Digit*10000 as Number
from Digits u
cross join Digits t
cross join Digits h
cross join Digits m
cross join Digits tm)
select top(1) Number
from Numbers
where Number between 11111 and 99999
and Number not in (
select ID
from Students)
order by (newid());
just don't. The requirement to randomize the set is a performance killer, and the join to eliminate existing (used) IDs is also problematic. But most importantly, the solution fails under concurrency, as multiple requests can get the same ID (and the likelihood increases as the number of free IDs shrinks). And of course, the semantically equivalent naive row-by-painfully-slow-row approaches, like your original code or Kaf's answer, have exactly the same problem but are also just plain slow. It is really worth testing them when all but one of the IDs are taken: watch the lights dim as you wait for the random number generator to hit the jackpot...
