R SQLDF PIVOT FOR A DATA FRAME - r

I have a data.frame named data6 that I want to pivot.
I tried sqldf, but even after checking thoroughly I can't work out why this error occurs:
"Error in result_create(conn@ptr, statement) : near "(": syntax error"
sql statement: "Select * From (Select * From data6) PIVOT(sum([Qty]) FOR [GRNUOM] in ([con2],[con3],[con4],[Con15],[Con2],[Con3],[Con4])"
This can apparently be done with the recast/dcast functions, but as a new user I don't know how to use them, or even what the difference between recast and dcast is.
I know this is a lot of questions, but I would appreciate any help.

By default, the sqldf package uses SQLite as the underlying database, and SQLite has no PIVOT operator, so the SQL Server style PIVOT syntax you are using won't work. While you might be able to change the underlying database to SQL Server, you could also just write the pivot as a standard conditional-aggregation query, which works in SQLite:
library(sqldf)  # provides sqldf()
sql <- "SELECT \"Style Name\",
SUM(CASE WHEN GRNUOM = 'con2' THEN Qty ELSE 0 END) AS con2,
SUM(CASE WHEN GRNUOM = 'con3' THEN Qty ELSE 0 END) AS con3,
SUM(CASE WHEN GRNUOM = 'con4' THEN Qty ELSE 0 END) AS con4,
SUM(CASE WHEN GRNUOM = 'Con15' THEN Qty ELSE 0 END) AS Con15,
SUM(CASE WHEN GRNUOM = 'Con2' THEN Qty ELSE 0 END) AS Con2,
SUM(CASE WHEN GRNUOM = 'Con3' THEN Qty ELSE 0 END) AS Con3,
SUM(CASE WHEN GRNUOM = 'Con4' THEN Qty ELSE 0 END) AS Con4
FROM data6
GROUP BY \"Style Name\""
result <- sqldf(sql)
I am not certain if con2 and Con2 would actually be recognized as separate column names. You might have to put all column names in double quotes to make the above query work the way you expect.
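To check both points directly against the engine sqldf uses, here is a minimal sketch with Python's built-in sqlite3 module. The table contents are made up for illustration (only the column names come from the question), and it only shows what SQLite itself does; how sqldf then maps the column names into the R data frame is not checked here.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE data6 ("Style Name" TEXT, GRNUOM TEXT, Qty INTEGER)')
conn.executemany(
    "INSERT INTO data6 VALUES (?, ?, ?)",
    [("A", "con2", 5), ("A", "Con2", 7), ("A", "con2", 1), ("B", "con3", 4)],
)

# SQL Server style PIVOT: SQLite has no such operator, so parsing fails.
try:
    conn.execute(
        "SELECT * FROM (SELECT * FROM data6) "
        "PIVOT (SUM(Qty) FOR GRNUOM IN ([con2], [Con2], [con3]))"
    )
    pivot_error = None
except sqlite3.OperationalError as exc:
    pivot_error = str(exc)
print(pivot_error)

# Conditional aggregation: one SUM(CASE ...) per output column.
cur = conn.execute("""
    SELECT "Style Name",
           SUM(CASE WHEN GRNUOM = 'con2' THEN Qty ELSE 0 END) AS "con2",
           SUM(CASE WHEN GRNUOM = 'Con2' THEN Qty ELSE 0 END) AS "Con2",
           SUM(CASE WHEN GRNUOM = 'con3' THEN Qty ELSE 0 END) AS "con3"
    FROM data6
    GROUP BY "Style Name"
    ORDER BY "Style Name"
""")
columns = [d[0] for d in cur.description]  # column names as SQLite reports them
rows = cur.fetchall()
print(columns)
print(rows)
```

In this sketch the string comparison on GRNUOM is case-sensitive (SQLite's default), so 'con2' and 'Con2' are summed separately, and the two aliases come back from SQLite as written.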

Related

I can't run this query with MERGE in Teradata

This query worked perfectly until I went on vacation. Now it no longer runs and the merge does not happen, and I don't know what the cause could be.
MERGE INTO STG_FATO_MACRO_GESTAO AS FAT
USING(SELECT DISTINCT
COD_EMPRESA
,FUN.MATRICULA AS FUN_MAT
,APR.MATRICULA AS APR_MAT
,FUN.CPF AS FUN_CPF
,APR.CPF AS APR_CPF
,APR.DAT_DESLIGAMENTO
,YEAR(APR.DAT_DESLIGAMENTO)*100+MONTH(APR.DAT_DESLIGAMENTO) AS DESL
,FUN.DATA_ADMISSAO
,YEAR(FUN.DATA_ADMISSAO)*100+MONTH(FUN.DATA_ADMISSAO) AS ADM
, CASE WHEN YEAR(APR.DAT_DESLIGAMENTO)*100+MONTH(APR.DAT_DESLIGAMENTO) <= YEAR(FUN.DATA_ADMISSAO)*100+MONTH(FUN.DATA_ADMISSAO) THEN 1 ELSE 0 END AS ADMITIDO
,CASE WHEN FUN.DATA_ADMISSAO <= (APR.DAT_DESLIGAMENTO + INTERVAL '90' DAY) THEN 1 ELSE 0 END AS APR_90
FROM (SELECT CPF,DATA_ADMISSAO, MATRICULA, COD_EMPRESA FROM DIM_FUNCIONARIO
WHERE PROFISSAO NOT LIKE '%APRENDIZ%') AS FUN
INNER JOIN (SELECT DISTINCT
CPF,DAT_DESLIGAMENTO,MATRICULA
FROM HST_APRENDIZ
WHERE FLAG_FECHAMENTO = 2
AND DAT_DESLIGAMENTO IS NOT NULL) AS APR
ON FUN.CPF = APR.CPF) AS APR_90
ON FAT.COD_EMPRESA = APR_90.COD_EMPRESA
AND FAT.MATRICULA = APR_90.FUN_MAT
AND APR_90.APR_90 = 1
AND APR_90.ADMITIDO = 1
WHEN MATCHED THEN
UPDATE SET APRENDIZ_EFETIVADO_90 = 1
;
Running this query returns the following error:
"The search condition must fully specify the Target table primary index and partition column(s) and expression must match INSERT specification primary index and partition column(s). "

two counts in the same row

Is it possible to get two counts in the same row?
The result of my query currently looks like this:
[image]
and I would like the end result to look like this:
[image]
and finally compute the percentage of count1 to count2.
My attempt using CASE was not successful:
SELECT Date, Shift,
CASE description WHEN 'Defects' THEN count ELSE 0 END AS Defect_Count,
CASE description WHEN 'Total' THEN count ELSE 0 END AS Total_Count
FROM ("Queries union)
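Here you go. The CASE expressions need to be wrapped in an aggregate and the query grouped by date and shift, so that both counts land on one row per group. Hope this helps.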
MYSQL:
select
t.dates, t.shift,
sum(case when t.description = 'Defects' then t.counts else 0 end) as `Defects`,
sum(case when t.description = 'Total' then t.counts else 0 end) as `Total`
from (
select *
from tbl ) t
group by t.dates, t.shift
order by t.dates, t.shift
ORACLE:
SELECT dates, shift, defects , total
FROM
(
SELECT *
FROM tbl
)
PIVOT
(
sum(counts)
FOR description IN ('Defects' as defects, 'Total' as total)
)
ORDER BY dates
Result:
dates       shift  Defects  Total
2018-01-20  AM     21       56
2018-01-20  PM     19       54
2018-01-23  AM     16       58
2018-01-23  PM     20       45
Many thanks, this works for the first step (both counts in the same row).
I will now try to build the percentage (Defects to Total).
Thanks.
To build the percentage (Defects to Total):
select dates, shift, defects, total, round((100*defects/total),2) as Percent2Total
from (
select t.dates, t.shift,
sum(case when t.description = 'Defects' then t.counts else 0 end) as `Defects`,
sum(case when t.description = 'Total' then t.counts else 0 end) as `Total`
from (
select *
from tbl ) t
group by t.dates, t.shift
) q
order by dates, shift
Is it possible to build this using only PIVOT?
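For anyone who wants to try the two-step approach (pivot first, then percentage) without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The input layout is an assumption: one row per date, shift and description, with values chosen to reproduce the result table posted above. Note that `100.0` and `NULLIF` are choices made for this sketch, because SQLite does integer division on integers and a zero total would otherwise be a problem; MySQL behaves differently on both counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: one row per (dates, shift, description).
conn.execute(
    "CREATE TABLE tbl (dates TEXT, shift TEXT, description TEXT, counts INTEGER)"
)
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", [
    ("2018-01-20", "AM", "Defects", 21), ("2018-01-20", "AM", "Total", 56),
    ("2018-01-20", "PM", "Defects", 19), ("2018-01-20", "PM", "Total", 54),
    ("2018-01-23", "AM", "Defects", 16), ("2018-01-23", "AM", "Total", 58),
    ("2018-01-23", "PM", "Defects", 20), ("2018-01-23", "PM", "Total", 45),
])

query = """
SELECT dates, shift, defects, total,
       -- 100.0 forces floating-point division; NULLIF avoids dividing by zero
       ROUND(100.0 * defects / NULLIF(total, 0), 2) AS percent_to_total
FROM (
    SELECT dates, shift,
           SUM(CASE WHEN description = 'Defects' THEN counts ELSE 0 END) AS defects,
           SUM(CASE WHEN description = 'Total' THEN counts ELSE 0 END) AS total
    FROM tbl
    GROUP BY dates, shift
) q
ORDER BY dates, shift
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```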

SELECT Failed. 3504: Selected non-aggregate values must be part of the associated group in v14.10.05.09 and not in v14.00.07.02

There are 9 fields in total in the select clause of the query. 5 are direct fields, 2 are the results of select queries in the select clause, and 2 more use the previous 2 fields in CASE WHEN expressions. The SUM aggregate is applied to the last two columns.
The first 5 columns are all listed in the GROUP BY clause.
Query :
SELECT
sub.a,sub.b,sub.c,Table1.d,Table1.e,
(select x from table t1 where y=current_date) as f,
(select z from table t1 where y=w) as g,
sum(case when f=g then 1 else 0 end) as h,
sum(case when f+1=g then 1 else 0 end) as i
FROM Table1
LEFT JOIN
(SELECT a,b,c,l,m,n,o FROM Table2 INNER JOIN Table3 ON Table2.p=Table3.t)sub
ON Table1.e = sub.l
GROUP BY sub.a,sub.b,sub.c,Table1.d,Table1.e
The above query works perfectly fine in Teradata v14.10.05.09 but fails with
"SELECT Failed. 3504: Selected non-aggregate values must be part of
the associated group"
in Teradata v14.00.07.02.
Is this because of something introduced in the newer version that is causing the query to fail but run fine in older version?
Or am I missing something?
Thanks in advance.
This is the actual query that is giving the problem:
SELECT
sub.key,sub.div,sub.reg_nm,tb_cal.calendar_yr,tb_cal.calendar_mth,
(SEL fiscal_yr FROM db1.tb_cal tb_cal WHERE calendar_dt=CURRENT_DATE) current_fy,
(SEL fiscal_yr FROM db1.tb_cal tb_cal WHERE calendar_dt=clse_dt) clse_dt_fy,
SUM(CASE WHEN current_fy=clse_dt_fy THEN sub.amt ELSE 0 END) cnt
FROM db1.tb_cal tb_cal
LEFT JOIN( SELECT
tbl_a.key,tbl_a.stage,COALESCE(tbl_u.div, 'Un-assigned') div,COALESCE(tbl_u.reg_nm,'Un-assigned') reg_nm,tbl_a.clse_dt,tbl_a.amt
FROM db_a.tbl_a tbl_a
INNER JOIN db1.tbl_u tbl_u
ON tbl_u.unit_key=tbl_a.unit_key
WHERE tbl_a.ctg IN ('G','O','C','F','I') AND tbl_a.stage NOT IN ('R', 'D', 'N','A') AND tbl_a.ind = 0
)sub
ON sub.clse_dt = tb_cal.calendar_dt
GROUP BY sub.key,sub.div,sub.reg_nm,tb_cal.calendar_yr,tb_cal.calendar_mth
Add the two scalar-subquery columns, current_fy and clse_dt_fy, to the GROUP BY:
SELECT
sub.key,
sub.div,
sub.reg_nm,
tb_cal.calendar_yr,
tb_cal.calendar_mth,
(SEL fiscal_yr FROM db1.tb_cal tb_cal WHERE calendar_dt=CURRENT_DATE) current_fy,
(SEL fiscal_yr FROM db1.tb_cal tb_cal WHERE calendar_dt=clse_dt) clse_dt_fy,
SUM(CASE WHEN current_fy=clse_dt_fy THEN sub.amt ELSE 0 END) cnt
FROM db1.tb_cal tb_cal
LEFT JOIN( SELECT
tbl_a.key,tbl_a.stage,COALESCE(tbl_u.div, 'Un-assigned') div,COALESCE(tbl_u.reg_nm,'Un-assigned') reg_nm,tbl_a.clse_dt,tbl_a.amt
FROM db_a.tbl_a tbl_a
INNER JOIN db1.tbl_u tbl_u
ON tbl_u.unit_key=tbl_a.unit_key
WHERE tbl_a.ctg IN ('G','O','C','F','I') AND tbl_a.stage NOT IN ('R', 'D', 'N','A') AND tbl_a.ind = 0
) sub
ON sub.clse_dt = tb_cal.calendar_dt
GROUP BY sub.key,sub.div,sub.reg_nm,tb_cal.calendar_yr,tb_cal.calendar_mth,current_fy,clse_dt_fy;
This should work fine. However, per your comments, adding the last two columns to the GROUP BY would alter the result you need. If that is the case, please post your desired result set and we can help from there. Looking at your query, though, it doesn't seem that adding current_fy and clse_dt_fy would change your current result set.
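The reasoning behind that last remark can be illustrated with a small sketch. It uses Python's built-in sqlite3 module rather than Teradata, and a made-up tb_cal with only the columns named in the question, under the assumption that every calendar month falls entirely within one fiscal year. When an extra GROUP BY column is fully determined by the existing grouping columns, the groups do not change:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical calendar rows; assumes one fiscal year per calendar month.
conn.execute(
    "CREATE TABLE tb_cal (calendar_dt TEXT, calendar_yr INTEGER, "
    "calendar_mth INTEGER, fiscal_yr INTEGER)"
)
conn.executemany("INSERT INTO tb_cal VALUES (?, ?, ?, ?)", [
    ("2014-01-15", 2014, 1, 2014), ("2014-01-20", 2014, 1, 2014),
    ("2014-07-01", 2014, 7, 2015), ("2014-07-09", 2014, 7, 2015),
])

# Grouping by the original keys only.
base = conn.execute(
    "SELECT calendar_yr, calendar_mth, COUNT(*) FROM tb_cal "
    "GROUP BY calendar_yr, calendar_mth ORDER BY 1, 2"
).fetchall()

# Same query with the dependent column added to the GROUP BY.
extended = conn.execute(
    "SELECT calendar_yr, calendar_mth, COUNT(*) FROM tb_cal "
    "GROUP BY calendar_yr, calendar_mth, fiscal_yr ORDER BY 1, 2"
).fetchall()

print(base)
print(extended)
```

If a month could span two fiscal years in your data, the two results would differ, and that would be the case where the extra GROUP BY columns change the answer.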

Convert Columns to ROW

I am stuck with this requirement -
I have some data in the format
(Entries now show data for both periods (Jan. 2011) and (Feb. 2011) on the same line as opposed to appearing separately).
At the end I need to print the data using dbms_output.put_line command.
I am using Oracle 10.2g.
Oracle 10g does not have a PIVOT function but you can convert the rows of data into columns using an aggregate function with a CASE expression. The basic syntax would be:
select d.id,
d.site,
d.entrance,
sum(case when d.date = 'Jan.2011' then enters else 0 end) "Jan.2011",
sum(case when d.date = 'Feb.2011' then enters else 0 end) "Feb.2011"
from
(
select id, site, entrance, date, enters
from yourdata
) d
group by d.id, d.site, d.entrance;
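Note: you can replace the subquery on yourdata with your current query. DATE is a reserved word in Oracle, so a column that is really named date would have to be referenced as a quoted identifier.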

Count if function in nested sql query

I have the following query in SQL Server 2005 on a table:
select t1.id, CONVERT(VARCHAR,t1.dt,103) date_1, CONVERT(VARCHAR,t2.dt,103) date_2, t1.hotel,
t1.price price_1, t2.price price_2, t2.price - t1.price difference, ((t2.price - t1.price)/t1.price)*100 as Diff_percentage
from test t1
inner join
(
select *
from test
) t2
on t1.hotel = t2.hotel
and t1.dt < t2.dt and t2.dt=(SELECT MAX(dt) from TEST) and t1.dt=(SELECT MAX(dt-1) from TEST)
I want to use a "count if" function within this query, based on the difference column, so that I can count how many increased, how many decreased, how many stayed the same, and how many are unavailable:
COUNT IF Difference>0 //How many increased
COUNT IF Difference<0 //How many decreased
COUNT IF Difference=0 //How many same
COUNT IF Difference="" //How many unavailable --Difference is blank.
DEMO : http://sqlfiddle.com/#!3/b6f37/29
If I wanted your existing query result I would rewrite the query as:
select t1.id,
CONVERT(VARCHAR,t1.dt,103) date_1,
CONVERT(VARCHAR,t2.dt,103) date_2,
t1.hotel,
t1.price price_1,
t2.price price_2,
t2.price - t1.price difference,
((t2.price - t1.price)/t1.price)*100 as Diff_percentage
from test t1
join (select max(dt) maxDt from test) d
on t1.dt = d.maxDt-1
join test t2
on t2.hotel = t1.hotel
and t2.dt = d.maxDt
To extend your original query to include hotels with missing rows:
select t1.id,
CONVERT(VARCHAR,t1.dt,103) date_1,
CONVERT(VARCHAR,t2.dt,103) date_2,
h.hotel,
t1.price price_1,
t2.price price_2,
t2.price - t1.price difference,
((t2.price - t1.price)/t1.price)*100 as Diff_percentage
from (select distinct hotel from test) h
cross join (select max(dt) maxDt from test) d
left join test t1
on t1.hotel = h.hotel
and t1.dt = d.maxDt-1
left join test t2
on t2.hotel = h.hotel
and t2.dt = d.maxDt
The above query would be much more efficient with a normalized HOTEL table (1 row per hotel) to replace the SELECT DISTINCT subquery.
To get your requested result, I would use:
select count(case when (t2.price-t1.price) < 0 then 1 end) decrease_count,
count(case when (t2.price-t1.price) > 0 then 1 end) increase_count,
count(case when (t2.price-t1.price) = 0 then 1 end) same_count,
count(distinct t1.hotel) - count(case when (t2.price-t1.price) is not null then 1 end) unavailable_count
from test t1
left join (select max(dt) maxDt from test) d
on t1.dt = d.maxDt-1
left join test t2
on t2.hotel = t1.hotel
and t2.dt = d.maxDt
Again the above would be more efficient with a normalized HOTEL table. I would restructure the query more like the previous one: select from HOTEL cross joined to the MAX date query, then outer join to the TEST table twice for the data for the 2 dates. The unavailable count could then be measured more directly, counting the number of rows where the difference computation is NULL.
Here is the SQL Fiddle for all the queries, along with some extended test data.
This approach uses your query and then just summarizes the results:
with t as (
select t1.id, CONVERT(VARCHAR,t1.dt,103) as date_1,
CONVERT(VARCHAR,t2.dt,103) as date_2,
t1.hotel,
t1.price as price_1, t2.price as price_2,
t2.price - t1.price as difference,
((t2.price - t1.price)/t1.price)*100 as Diff_percentage
from test t1 join
test t2
on t1.hotel = t2.hotel and
t1.dt < t2.dt and
t2.dt=(SELECT MAX(dt) from TEST) and
t1.dt=(SELECT MAX(dt-1) from TEST)
)
select sum(case when diff_percentage > 0.0 then 1 else 0 end) as numIncrease,
sum(case when diff_percentage < 0.0 then 1 else 0 end) as numDecrease,
sum(case when diff_percentage = 0.0 then 1 else 0 end) as numSame,
sum(case when diff_percentage is NULL then 1 else 0 end) as numBlank
from t
I'm not sure what "dt - 1" means. With date/datetime values in SQL Server, one usually uses "dateadd(day, -1, )" to subtract a day. In any case, there may be other ways to calculate what you want, but this answers your specific question.
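As a runnable illustration of the "count if" idea from the answers above, here is a minimal sketch using Python's built-in sqlite3 module instead of SQL Server. The rows are invented, dates are stored as ISO text, and SQLite's `date(..., '-1 day')` stands in for the previous-day arithmetic; those are all assumptions of this sketch. It follows the restructuring suggested in the first answer: start from the distinct hotels, cross join the latest date, then outer join the table twice. `COUNT(CASE ... THEN 1 END)` works because COUNT skips the NULL that CASE yields when no branch matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: hotel D has no row for the latest date.
conn.executescript("""
CREATE TABLE test (id INTEGER, dt TEXT, hotel TEXT, price REAL);
INSERT INTO test VALUES
  (1, '2012-09-26', 'A', 100), (2, '2012-09-27', 'A', 120),
  (3, '2012-09-26', 'B', 200), (4, '2012-09-27', 'B', 150),
  (5, '2012-09-26', 'C', 300), (6, '2012-09-27', 'C', 300),
  (7, '2012-09-26', 'D', 400);
""")

query = """
SELECT COUNT(CASE WHEN t2.price - t1.price > 0 THEN 1 END) AS increase_count,
       COUNT(CASE WHEN t2.price - t1.price < 0 THEN 1 END) AS decrease_count,
       COUNT(CASE WHEN t2.price - t1.price = 0 THEN 1 END) AS same_count,
       -- a missing row on either date makes the difference NULL
       COUNT(CASE WHEN t2.price - t1.price IS NULL THEN 1 END) AS unavailable_count
FROM (SELECT DISTINCT hotel FROM test) h
CROSS JOIN (SELECT MAX(dt) AS max_dt FROM test) d
LEFT JOIN test t1 ON t1.hotel = h.hotel AND t1.dt = date(d.max_dt, '-1 day')
LEFT JOIN test t2 ON t2.hotel = h.hotel AND t2.dt = d.max_dt
"""
counts = conn.execute(query).fetchone()
print(counts)
```

With inner joins, as in the original query, hotel D would simply disappear from the result, so the unavailable count could never be anything but zero.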
