This query worked perfectly until I went on vacation. Now it no longer runs and does not merge, and I don't know what the cause could be.
MERGE INTO STG_FATO_MACRO_GESTAO AS FAT
USING(SELECT DISTINCT
COD_EMPRESA
,FUN.MATRICULA AS FUN_MAT
,APR.MATRICULA AS APR_MAT
,FUN.CPF AS FUN_CPF
,APR.CPF AS APR_CPF
,APR.DAT_DESLIGAMENTO
,YEAR(APR.DAT_DESLIGAMENTO)*100+MONTH(APR.DAT_DESLIGAMENTO) AS DESL
,FUN.DATA_ADMISSAO
,YEAR(FUN.DATA_ADMISSAO)*100+MONTH(FUN.DATA_ADMISSAO) AS ADM
, CASE WHEN YEAR(APR.DAT_DESLIGAMENTO)*100+MONTH(APR.DAT_DESLIGAMENTO) <= YEAR(FUN.DATA_ADMISSAO)*100+MONTH(FUN.DATA_ADMISSAO) THEN 1 ELSE 0 END AS ADMITIDO
,CASE WHEN FUN.DATA_ADMISSAO <= (APR.DAT_DESLIGAMENTO + INTERVAL '90' DAY) THEN 1 ELSE 0 END AS APR_90
FROM (SELECT CPF,DATA_ADMISSAO, MATRICULA, COD_EMPRESA FROM DIM_FUNCIONARIO
WHERE PROFISSAO NOT LIKE '%APRENDIZ%') AS FUN
INNER JOIN (SELECT DISTINCT
CPF,DAT_DESLIGAMENTO,MATRICULA
FROM HST_APRENDIZ
WHERE FLAG_FECHAMENTO = 2
AND DAT_DESLIGAMENTO IS NOT NULL) AS APR
ON FUN.CPF = APR.CPF) AS APR_90
ON FAT.COD_EMPRESA = APR_90.COD_EMPRESA
AND FAT.MATRICULA = APR_90.FUN_MAT
AND APR_90.APR_90 = 1
AND APR_90.ADMITIDO = 1
WHEN MATCHED THEN
UPDATE SET APRENDIZ_EFETIVADO_90 = 1
;
When I run this query, it returns this error:
"The search condition must fully specify the Target table primary index and partition column(s) and expression must match INSERT specification primary index and partition column(s). "
I am having trouble with a query.
Fiddle: https://www.db-fiddle.com/f/JXQHw1VzF7vAowNLFrxv5/1
This is not going to work.
So my question is: what has to be done to get a result when I want to use both conditions?
(attr_key = 0 AND attr_value & 201326592 = 201326592)
AND
(attr_key = 30 AND attr_value & 8 = 8)
Thanks in advance!
Best regards
One way to check for the presence of some number of key value pairs in the items_attributes table would be to use conditional aggregation:
SELECT i.id
FROM items i
LEFT JOIN items_attributes ia
ON i.id = ia.owner
GROUP BY
i.id
HAVING
SUM(CASE WHEN ia.key = 0 AND ia.value = 201326592 THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN ia.key = 30 AND ia.value = 8 THEN 1 ELSE 0 END) > 0
The trick in the above query is that we scan each cluster of key/value pairs for each item, and then check whether the pairs you expect are present.
Note: My query just returns id values from items matching all key value pairs. If you want to bring in other columns from either of the two tables, you may simply add on more joins to what I wrote above.
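As a minimal sketch of the conditional aggregation idea, here is a version that keeps the bitmask test from the question (value & mask = mask) instead of a plain equality. It uses Python's sqlite3 module with an in-memory database. The table layout, the column names attr_key and attr_value, the sample rows and the helper items_with_flags are assumptions made for illustration, not the schema from the fiddle.

```python
import sqlite3

# Hypothetical schema and sample data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY);
    CREATE TABLE items_attributes (owner INTEGER, attr_key INTEGER, attr_value INTEGER);
    INSERT INTO items (id) VALUES (1), (2), (3);
    INSERT INTO items_attributes (owner, attr_key, attr_value) VALUES
        (1, 0, 201326592), (1, 30, 8),   -- both conditions satisfied
        (2, 0, 201326592), (2, 30, 4),   -- key 30 lacks bit 8
        (3, 30, 12);                     -- key 0 is missing entirely
""")

def items_with_flags(conn, required):
    """Return ids of items that have, for every (key, mask) pair,
    at least one attribute row whose value has all mask bits set."""
    # One HAVING term per required pair; all terms must hold for the group.
    having = " AND ".join(
        "SUM(CASE WHEN ia.attr_key = ? AND (ia.attr_value & ?) = ? THEN 1 ELSE 0 END) > 0"
        for _ in required
    )
    params = [p for key, mask in required for p in (key, mask, mask)]
    sql = f"""
        SELECT i.id
        FROM items i
        LEFT JOIN items_attributes ia ON i.id = ia.owner
        GROUP BY i.id
        HAVING {having}
        ORDER BY i.id
    """
    return [row[0] for row in conn.execute(sql, params)]

matching_ids = items_with_flags(conn, [(0, 201326592), (30, 8)])
print(matching_ids)
```

Each condition is checked against a different attribute row of the same item, which is why a plain WHERE with both conditions joined by AND cannot match: no single row has attr_key equal to 0 and 30 at once.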
I am trying to optimize a query that has multiple counts for objects in a subordinate table (aliases are used in SQLAlchemy). In Witch Academia terms, something like this:
SELECT
exam.id AS exam_id,
exam.name AS exam_name,
count(tried_witch.id) AS tried,
count(passed_witch.id) AS passed,
count(failed_witch.id) AS failed
FROM exam
LEFT OUTER JOIN witch AS tried_witch
ON tried_witch.exam_id = exam.id AND
tried_witch.is_failed = 0 AND
tried_witch.status != "passed"
LEFT OUTER JOIN witch AS passed_witch
ON passed_witch.exam_id = exam.id AND
passed_witch.is_failed = 0 AND
passed_witch.status = "passed"
LEFT OUTER JOIN witch AS failed_witch
ON failed_witch.exam_id = exam.id AND
failed_witch.is_failed = 1
GROUP BY exam.id, exam.name
ORDER BY tried ASC
LIMIT 20
The number of witches can be large (hundreds of thousands) and the number of exams is lower (hundreds), so the above query is quite slow. In a lot of similar questions I've found answers that propose the above, but I feel like a totally different approach is needed here. I am stuck coming up with an alternative. NB: the results need to be ordered by the calculated counts. It is also important to have zeros as counts, of course, where due. (Do not pay attention to the somewhat funny model: witches can easily clone themselves to go to multiple exams, hence the per-exam identity.)
With one EXISTS subquery, which is not reflected in the above and does not influence the outcome, the situation is:
# Query_time: 1.135747 Lock_time: 0.000209 Rows_sent: 20 Rows_examined: 98174
# Rows_affected: 0
# Full_scan: Yes Full_join: No Tmp_table: Yes Tmp_table_on_disk: Yes
# Filesort: Yes Filesort_on_disk: No Merge_passes: 0 Priority_queue: No
Updated query, which is still quite slow:
SELECT
exam.id AS exam_id,
exam.name AS exam_name,
count(CASE WHEN (witch.status != "passed" AND witch.is_failed = 0)
THEN witch.id
ELSE NULL END) AS tried,
count(CASE WHEN (witch.status = "passed" AND witch.is_failed = 0)
THEN witch.id
ELSE NULL END) AS passed,
count(CASE WHEN (witch.is_failed = 1)
THEN witch.id
ELSE NULL END) AS failed
FROM exam
LEFT OUTER JOIN witch ON witch.exam_id = exam.id
GROUP BY exam.id, exam.name
ORDER BY tried ASC
LIMIT 20
Indexing is the key to good performance for this query.
I do not know MariaDB at all, so I am not sure what the possibilities are. But if it is anything like Microsoft SQL Server, then here is what I would try:
Create ONE composite index covering ALL the required columns of the witch table: the join column (exam_id in the query above), status and is_failed. If the query uses that index, that should be it. Here the order of the included columns might be very important. Then profile the query in order to understand whether the index is used. See the Optimization and Indexes documentation page.
Consider Generated (Virtual and Persistent) Columns.
It looks like all the information needed to classify a witch into the tried, passed or failed bucket is contained in the row for that witch. Therefore, you can create those virtual columns directly on the database table and use the PERSISTENT option, which allows an index to be created on them. Then you can create an index specifically for this query containing the join column and the three virtual columns: tried, passed and failed. Make sure your query uses it, and that should be pretty good. The query will then look very simple:
SELECT exam.id,
exam.name,
sum(witch.tried) AS tried,
sum(witch.passed) AS passed,
sum(witch.failed) AS failed
FROM exam
INNER JOIN witch ON exam.id = witch.exam_id
GROUP BY exam.id,
exam.name
ORDER BY sum(witch.tried)
LIMIT 20
Although the classification only involves simple comparisons and AND/OR clauses, you are basically offloading the calculation of the three statuses to the database during INSERT/UPDATE. Then during SELECT your query should be much faster.
Your example does not specify any result filtering (WHERE clause), but if you have one, it might also have an impact on the way one optimises indices for query performance.
Original answer: Below is the originally proposed change to the query.
Here I assume that the indexing part of the optimisation has already been done.
Could you try with SUM instead of COUNT?
SELECT exam.id,
exam.name,
sum(CASE
WHEN (witch.is_failed = 0
AND witch.status != 'passed') THEN 1
ELSE 0
END) AS tried,
sum(CASE
WHEN (witch.is_failed = 0
AND witch.status = 'passed') THEN 1
ELSE 0
END) AS passed,
sum(CASE
WHEN (witch.is_failed = 1) THEN 1
ELSE 0
END) AS failed
FROM exam
INNER JOIN witch ON exam.id = witch.exam_id
GROUP BY exam.id,
exam.name
ORDER BY sum(CASE
WHEN (witch.is_failed = 0
AND witch.status != 'passed') THEN 1
ELSE 0
END)
LIMIT 20
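The question asks for exams with no witches to still appear with zero counts. As a hedged sketch (not the answer's exact query), the following uses Python's sqlite3 module as a stand-in for MariaDB to compare the SUM(CASE ...) form under an INNER JOIN and under a LEFT OUTER JOIN. The sample rows and the extra tie-breaker in ORDER BY are assumptions added for illustration.

```python
import sqlite3

# In-memory stand-in database; the sample data is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE exam (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE witch (id INTEGER PRIMARY KEY, exam_id INTEGER,
                        is_failed INTEGER, status TEXT);
    INSERT INTO exam VALUES (1, 'potions'), (2, 'flying');
    -- Exam 2 deliberately has no witches at all.
    INSERT INTO witch (exam_id, is_failed, status) VALUES
        (1, 0, 'passed'), (1, 0, 'pending'), (1, 1, 'pending');
""")

QUERY = """
    SELECT exam.id, exam.name,
           SUM(CASE WHEN witch.is_failed = 0 AND witch.status != 'passed'
                    THEN 1 ELSE 0 END) AS tried,
           SUM(CASE WHEN witch.is_failed = 0 AND witch.status = 'passed'
                    THEN 1 ELSE 0 END) AS passed,
           SUM(CASE WHEN witch.is_failed = 1 THEN 1 ELSE 0 END) AS failed
    FROM exam
    {join} JOIN witch ON exam.id = witch.exam_id
    GROUP BY exam.id, exam.name
    ORDER BY tried, exam.id
    LIMIT 20
"""

# INNER JOIN drops exams that have no witch rows.
inner_rows = conn.execute(QUERY.format(join="INNER")).fetchall()
# LEFT OUTER JOIN keeps them; the all-NULL witch row falls into ELSE 0.
left_rows = conn.execute(QUERY.format(join="LEFT OUTER")).fetchall()

print(inner_rows)
print(left_rows)
```

With the outer join, an exam without witches contributes a single row of NULLs, every CASE falls through to ELSE 0, and the sums come out as zero rather than the exam disappearing from the result.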
The rest:
Given that you mentioned SQLAlchemy in your question, here is the SQLAlchemy code I used to model and generate the query:
# imports (SQLAlchemy 1.4 or later is assumed here)
from sqlalchemy import Column, ForeignKey, Integer, String, and_, case, func
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

# model
class Exam(Base):
    __tablename__ = 'exam'
    id = Column(Integer, primary_key=True)
    name = Column(String)

class Witch(Base):
    __tablename__ = 'witch'
    id = Column(Integer, primary_key=True)
    exam_id = Column(Integer, ForeignKey('exam.id'))
    is_failed = Column(Integer)
    status = Column(String)
    exam = relationship(Exam, backref='witches')

    # computed fields (evaluated in Python on a loaded instance)
    @hybrid_property
    def tried(self):
        return self.is_failed == 0 and self.status != 'passed'

    @hybrid_property
    def passed(self):
        return self.is_failed == 0 and self.status == 'passed'

    @hybrid_property
    def failed(self):
        return self.is_failed == 1

    # computed fields: SQL expressions (used at class level in queries).
    # Each method reuses the property's name so that Witch.tried etc. pick it up.
    @tried.expression
    def tried(cls):
        return case((and_(cls.is_failed == 0, cls.status != 'passed'), 1), else_=0)

    @passed.expression
    def passed(cls):
        return case((and_(cls.status == 'passed', cls.is_failed == 0), 1), else_=0)

    @failed.expression
    def failed(cls):
        return case((cls.is_failed == 1, 1), else_=0)
and:
# query
q = (
session.query(
Exam.id, Exam.name,
func.sum(Witch.tried).label("tried"),
func.sum(Witch.passed).label("passed"),
func.sum(Witch.failed).label("failed"),
)
.join(Witch)
.group_by(Exam.id, Exam.name)
.order_by(func.sum(Witch.tried))
.limit(20)
)
I am using the following insert query to create a comparison between two tables using the dates to join on.
INSERT INTO Comp_Table (Date, CKROne, CKRTwo, ChangeOne, ChangeTwo, State)
SELECT BaseTbl.Date, BaseTbl.CKR, CompTbl.CKR, BaseTbl.Change, CompTbl.Change,
CASE
WHEN BaseTbl.Change > 0 AND CompTbl.Change > 0 THEN 'positive'
WHEN BaseTbl.Change < 0 AND CompTbl.Change < 0 THEN 'positive'
ELSE 'inversely'
END AS 'Correlation'
FROM BaseTbl
JOIN CompTbl ON BaseTbl.Date = CompTbl.Date;
This works well. However, I would like to be able to join the tables with a lag. That is, the user can choose either an exact match on dates, or a match on one table's date plus a number of days, so that the value from the later date is compared with the value from the earlier date. Pseudocode example:
User sets variable = 0 then
Join CompTbl On BaseTbl.Date = CompTbl.Date + 0;
User sets variable = 7 then
Join CompTbl On BaseTbl.Date = CompTbl.Date + 7;
(joins 2012-01-01 from BaseTbl to 2012-01-08 from CompTbl)
I tried to add days the way you would in a WHERE clause ('+7 day'), but this didn't work. I also tried using a WHERE clause with BaseTbl.Date = CompTbl.Date '+ 7 day', but that also returned a 0 value. How can this be accomplished in SQLite?
I think you can use the DATE() function with a modifier to build the join condition you want:
INSERT INTO ...
SELECT ...
FROM BaseTbl
INNER JOIN CompTbl
ON BaseTbl.Date = DATE(CompTbl.Date, '7 days')
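To make the lag user-configurable, the modifier string can be passed as a bound parameter. The following is a minimal sketch using Python's sqlite3 module. It assumes the dates are stored as ISO 'YYYY-MM-DD' text, and it follows the direction of the worked example in the question (2012-01-01 in BaseTbl pairs with 2012-01-08 in CompTbl), so the lag is added to BaseTbl.Date; move the DATE() call to the other side for the opposite direction. The helper compare_with_lag and the sample rows are hypothetical.

```python
import sqlite3

# In-memory tables with made-up sample data; dates are ISO text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BaseTbl (Date TEXT, CKR TEXT, Change REAL);
    CREATE TABLE CompTbl (Date TEXT, CKR TEXT, Change REAL);
    INSERT INTO BaseTbl VALUES ('2012-01-01', 'AAA', 1.5), ('2012-01-08', 'AAA', -0.5);
    INSERT INTO CompTbl VALUES ('2012-01-01', 'BBB', 2.0), ('2012-01-08', 'BBB', -1.0);
""")

def compare_with_lag(conn, lag_days):
    """Join BaseTbl rows to CompTbl rows dated lag_days later."""
    # Build a SQLite date modifier such as '+7 days' or '+0 days'.
    modifier = f"{int(lag_days):+d} days"
    sql = """
        SELECT BaseTbl.Date, CompTbl.Date, BaseTbl.Change, CompTbl.Change
        FROM BaseTbl
        INNER JOIN CompTbl ON DATE(BaseTbl.Date, ?) = CompTbl.Date
        ORDER BY BaseTbl.Date
    """
    return conn.execute(sql, (modifier,)).fetchall()

exact_rows = compare_with_lag(conn, 0)
lagged_rows = compare_with_lag(conn, 7)
print(exact_rows)
print(lagged_rows)
```

Binding the modifier as a parameter keeps the SQL text fixed, so the same statement serves both the exact match (lag 0) and any lagged comparison.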
I would like to use the Month column from the syntax below in a CASE statement. When I create a subquery I receive the Oracle error ORA-01788 (CONNECT BY clause required in this query block). How can I use the Month column in the CASE statement with the subquery?
TO_CHAR(ADD_MONTHS(TRUNC(StartDate, 'MM'), LEVEL - 1), 'YYYYMM') AS Month
Query below:
SELECT
CASE
WHEN first_assgn_dt_YYYYMM <= Month
THEN 0
WHEN EndDate < LAST_DAY(EndDate) AND EndDate != sysdate
AND LEVEL = 1 + MONTHS_BETWEEN(TRUNC(EndDate,'MM'),TRUNC(StartDate,'MM'))
THEN 0
ELSE 1
END AS active_at_month_end
FROM (
WITH
ActiveMemberData (ID,StartDate,EndDate,first_assgn_dt,first_assgn_dt_YYYYMM) AS (
SELECT DISTINCT
x.ID,
TRUNC(x.start_dt) AS StartDate,
CASE WHEN TRUNC(X.END_DT) = '1-JAN-3000' THEN SYSDATE ELSE TO_DATE(X.END_DT) END AS EndDate,
x.first_assgn_dt,
TO_CHAR(first_assgn_dt,'YYYYMM') AS first_assgn_dt_YYYYMM
FROM X
LEFT JOIN D ON X.MID = D.ID
WHERE 1=1
)
--------------------------------------------------
SELECT DISTINCT
ID,
first_assgn_dt,
first_assgn_dt_YYYYMM,
StartDate,
TO_CHAR(StartDate,'YYYYMM') AS StartDate_YYYYMM,
EndDate,
TO_CHAR(ADD_MONTHS(TRUNC(StartDate, 'MM'), LEVEL - 1), 'YYYYMM') AS Month,
LAST_DAY(EndDate) AS LastDayOfMonth
FROM ActiveMemberData
WHERE 1=1
------------------------------------------------------------------------------------
CONNECT BY LEVEL <= 1 + MONTHS_BETWEEN(TRUNC(EndDate,'MM'), TRUNC(StartDate,'MM'))
AND PRIOR ID = ID AND PRIOR STARTDATE = STARTDATE
AND PRIOR sys_guid() IS NOT NULL
) Z
WHERE 1=1
ORDER BY
ID,
Month
That has nothing to do with trying to refer to Month from the inline view; that is fine. It's the separate reference to level that is causing the error.
If you want to be able to see the level from your inline view in the outer query, as you are with this line:
AND LEVEL = 1 + MONTHS_BETWEEN(TRUNC(EndDate,'MM'),TRUNC(StartDate,'MM'))
then you have to include it in the select list - with an alias - and then refer to that alias:
SELECT
CASE
...
AND LEVEL_ALIAS = 1 + MONTHS_BETWEEN(TRUNC(EndDate,'MM'),TRUNC(StartDate,'MM'))
...
FROM (
...
SELECT DISTINCT
LEVEL as LEVEL_ALIAS,
ID,
...
You can call the alias whatever you want, of course; you just can't use the reserved word level.
Anything you want visible in the outer query always has to be in the inline view's select list. Usually you can keep the original column name, but you have to use an alias for an expression or a pseudocolumn, which is the case here.
You don't have to use an alias for the reserved word. Just add double quotes and capitalise it, i.e. "LEVEL".
I'm trying to verify whether data exists in two different tables in a single transaction. The reason for the single transaction is that the database gets hit about 1-3 million times a day, so adding any more than one extra transaction would increase that number up to 9 million, and my poor little server needs a break :)
So I need to check whether an ID exists in table X and table Y and return the results to my VB.net script so I can handle the outcome. Ideally something like this would work:
if exists (select id from X where id = @id)
print 'True,' else print 'False,'
if exists (select id from Y where id = @id)
print 'True' else print 'False'
This gives me "True, True" if the ID exists in both, or "True, False", etc. But that only displays in the SQL print output and does not actually return an object, string or array of values that I can use.
I'm open to any sort of solution of this nature that can give me two results from a single transaction, and to advice on how to handle that response in VB. Thanks
SELECT
Case When EXISTS(SELECT 1 FROM X WHERE id = @id) Then 1 Else 0 End AS IsInX,
Case When EXISTS(SELECT 1 FROM Y WHERE id = @id) Then 1 Else 0 End AS IsInY
select (select COUNT(*) from X where id = @id) AS x_exists,
(select COUNT(*) from Y where id = @id) AS y_exists
This returns one data row with two fields, each containing either 0 or 1 (or more, if id is not unique).
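As a rough sketch of how a client can consume that single row, here is the same idea in Python, with the sqlite3 module standing in for SQL Server and for the VB.net data access code. The helper check_id, the named parameter :id and the sample rows are assumptions for illustration; in VB.net the equivalent step would be reading the two columns from the one returned row.

```python
import sqlite3

# In-memory stand-in for the real database; sample ids are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (id INTEGER PRIMARY KEY);
    CREATE TABLE Y (id INTEGER PRIMARY KEY);
    INSERT INTO X VALUES (1), (2);
    INSERT INTO Y VALUES (2), (3);
""")

def check_id(conn, id_value):
    """Return (in_x, in_y) for id_value using a single round trip."""
    row = conn.execute(
        """
        SELECT
            CASE WHEN EXISTS (SELECT 1 FROM X WHERE id = :id) THEN 1 ELSE 0 END AS IsInX,
            CASE WHEN EXISTS (SELECT 1 FROM Y WHERE id = :id) THEN 1 ELSE 0 END AS IsInY
        """,
        {"id": id_value},
    ).fetchone()
    # The query always yields exactly one row with two 0/1 flags.
    return bool(row[0]), bool(row[1])

print(check_id(conn, 1))
print(check_id(conn, 2))
```

Because the statement always returns exactly one row, the caller never has to handle an empty result set, only the two flag values.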
CREATE PROCEDURE CheckIDOnTables(@ID int)
AS
BEGIN
DECLARE @X AS NVARCHAR(10)
DECLARE @Y AS NVARCHAR(10)
Set @X = 'False'
Set @Y = 'False'
if exists (select id from TableX where id = @ID)
Set @X = 'True'
if exists (select id from TableY where id = @ID)
Set @Y = 'True'
SELECT @X AS XExists, @Y AS YExists
END
It will give you your desired results.