I have a SQLite table of user actions on a website. Each row records the same action, just at a different date/time, tagged with a user id. The table has more than 20 million entries. I understand how to get a count per user (i.e. A took the action 3 times, B 4, C 2, D 4, etc.) by grouping on the user id. In other words, this works fine:
select count(uid) as event_count
from table
group by uid
What I want is the data for a statistical distribution: a count of the users who took exactly 1 action, a count of the users who took 2 actions, etc. Said another way, the list might look something like:
1 | 339,440
2 | 452,555
3 | 99,239
5 | 20,209
etc. ...
I could use the having event_count = n clause and just rerun the query for every integer until all were accounted for but that seems silly. There must be a way that I can get a single list with two columns: the group size and the count of the users who all took the exact same number of actions.
It's as simple as adding another level of grouping on top of your query:
select event_count, count(*) as users_count
from
(select count(uid) as event_count
from table
group by uid) t
group by event_count
order by event_count
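As a rough check of the nested grouping, here is a minimal sketch using Python's built-in sqlite3 module. The table name `actions`, its columns, and the sample rows are assumptions for illustration (the rows mirror the A/B/C/D example from the question), not the real schema.

```python
import sqlite3

# Hypothetical in-memory table standing in for the real 20-million-row one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actions (uid TEXT)")
# A took the action 3 times, B 4, C 2, D 4 (the example from the question).
rows = [("A",)] * 3 + [("B",)] * 4 + [("C",)] * 2 + [("D",)] * 4
conn.executemany("INSERT INTO actions (uid) VALUES (?)", rows)

# Inner query: events per user. Outer query: users per event count.
distribution = conn.execute(
    """
    SELECT event_count, COUNT(*) AS users_count
    FROM (SELECT COUNT(uid) AS event_count FROM actions GROUP BY uid) AS t
    GROUP BY event_count
    ORDER BY event_count
    """
).fetchall()
print(distribution)  # [(2, 1), (3, 1), (4, 2)]
```

One user (C) took 2 actions, one (A) took 3, and two (B and D) took 4, which is the two-column list asked for.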
A) List each lecturer together with each module they teach and the number of students studying that module, in order of the lecturer name.
B) Output the number of modules in which everyone passed the module (assuming pass mark is 40).
For A, you need to join the 2 tables and then group by lecturer, module and count the number of rows for each group (each row corresponds to a student):
select t.lecturer, t.module, count(*) numberofstudents
from teaches t inner join studies s
on s.module = t.module
group by t.lecturer, t.module
order by t.lecturer
For B, use NOT EXISTS to find the modules where all grades are >= 40 and count them:
select count(distinct module) numberofmodules
from studies s
where not exists (
select 1 from studies
where module = s.module and grade < 40
)
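Both queries can be tried against toy data with Python's sqlite3 module. This is only a sketch: the column lists of `teaches` and `studies` beyond the ones used in the queries, and all the sample rows, are assumptions. The sketch also orders by module within each lecturer so that the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows; only module, lecturer and grade matter here.
conn.executescript(
    """
    CREATE TABLE teaches (lecturer TEXT, module TEXT);
    CREATE TABLE studies (student TEXT, module TEXT, grade INTEGER);
    INSERT INTO teaches VALUES ('Adams', 'AI'), ('Adams', 'DB'), ('Brown', 'OS');
    INSERT INTO studies VALUES
        ('s1', 'DB', 55), ('s2', 'DB', 38), ('s3', 'AI', 70),
        ('s1', 'OS', 41), ('s2', 'OS', 65), ('s3', 'OS', 90);
    """
)

# A) students per lecturer and module (one joined row per student).
per_module = conn.execute(
    """
    SELECT t.lecturer, t.module, COUNT(*) AS numberofstudents
    FROM teaches t INNER JOIN studies s ON s.module = t.module
    GROUP BY t.lecturer, t.module
    ORDER BY t.lecturer, t.module
    """
).fetchall()

# B) modules with no grade below 40.
all_passed = conn.execute(
    """
    SELECT COUNT(DISTINCT module) AS numberofmodules
    FROM studies s
    WHERE NOT EXISTS (
        SELECT 1 FROM studies WHERE module = s.module AND grade < 40
    )
    """
).fetchone()[0]

print(per_module)  # [('Adams', 'AI', 1), ('Adams', 'DB', 2), ('Brown', 'OS', 3)]
print(all_passed)  # 2
```

In the sample data only DB has a failing grade (38), so AI and OS are the two modules counted for B.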
I need to get the total number of results for each person, but I get ...
(result)
My query:
select t.fecha_hora_timbre,e.nombre,e.apellido,d.descripcion as departamento_trabaja, t.fecha,count(*)
from fulltime.timbre t, fulltime.empleado e, fulltime.departamento d
where d.depa_id=e.depa_id and t.codigo_empleado=e.codigo_empleado and
trunc(t.fecha) between trunc(to_date('15/02/2017','dd/mm/yyyy')) and trunc(to_date('14/03/2017','dd/mm/yyyy'))
group by t.fecha_hora_timbre,e.nombre,e.apellido,d.descripcion, t.fecha
Expected data...
NOMBRE | APELLIDO | DEPARTAMENTO_TRABAJA | VECES_MARCADAS(count)
MARIA TARCILA IGLESIAS BECERRA ALCALDIA 4
KATHERINE TATIANA SEGOVIA FERNANDEZ ALCALDIA 10
FREDDY AGUSTIN VALDIVIESO VALLEJO ALCALDIA 3
UPDATE..
select e.nombre,e.apellido,d.descripcion as departamento_trabaja,COUNT(*)
from fulltime.timbre t, fulltime.empleado e, fulltime.departamento d
where d.depa_id=e.depa_id and t.codigo_empleado=e.codigo_empleado and
trunc(t.fecha) between trunc(to_date('15/02/2017','dd/mm/yyyy')) and trunc(to_date('14/03/2017','dd/mm/yyyy'))
group by t.fecha_hora_timbre,e.nombre,e.apellido,d.descripcion, t.fecha
You should only select and group by the non-aggregate columns you actually want to count against. At the moment you're including the fecha_hora_timbre and fecha columns in each row, so you're counting the unique combinations of those columns as well as the name/department information you actually want to count.
select e.nombre, e.apellido, d.descripcion as departamento_trabaja,
count(*) as veces_marcadas
from fulltime.timbre t
join fulltime.empleado e on t.codigo_empleado=e.codigo_empleado
join fulltime.departamento d on d.depa_id=e.depa_id
where t.fecha >= to_date('15/02/2017','dd/mm/yyyy')
and t.fecha < to_date('15/03/2017','dd/mm/yyyy')
group by e.nombre, e.apellido, d.descripcion
I've removed the extra columns. Notice that they have gone from both the select list and the group-by clause. If you have a non-aggregate column in the select list that isn't in the group-by you'll get an ORA-00979 error (or ORA-00937 if there is no group-by clause at all); but if you have a column in the group-by that isn't in the select list then it will still group by that even though you can't see it, and you just won't get the results you expect.
I've also changed from old-style join syntax to modern syntax. And I've changed the date comparison; firstly because the trunc() in trunc(to_date('15/02/2017','dd/mm/yyyy')) is pointless - you already know the time part is midnight, so the trunc doesn't achieve anything. But mostly so that if there is an index on fecha that index can be used. If you do trunc(t.fecha) then every column value has to be truncated, which stops the index being used (unless you have a function-based index). As between is inclusive, using >= and < with one day later on the higher limit should have the same effect overall.
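The effect of the extra group-by column and of the half-open date range can be seen on toy data. The sketch below uses SQLite through Python rather than Oracle, so it is only an analogy: the reduced schema, the sample rows, and the ISO-formatted text timestamps (which SQLite compares as strings) are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down versions of the empleado and timbre tables.
conn.executescript(
    """
    CREATE TABLE empleado (codigo_empleado INTEGER, nombre TEXT);
    CREATE TABLE timbre (codigo_empleado INTEGER, fecha TEXT);
    INSERT INTO empleado VALUES (1, 'MARIA'), (2, 'FREDDY');
    INSERT INTO timbre VALUES
        (1, '2017-02-15 08:00:00'), (1, '2017-02-15 17:00:00'),
        (1, '2017-03-14 18:30:00'), (1, '2017-03-15 08:00:00'),
        (2, '2017-02-14 23:59:59'), (2, '2017-02-20 09:00:00');
    """
)

# Half-open range: from 15 Feb inclusive up to, but not including, 15 Mar.
QUERY = """
    SELECT e.nombre, COUNT(*) AS veces_marcadas
    FROM timbre t
    JOIN empleado e ON t.codigo_empleado = e.codigo_empleado
    WHERE t.fecha >= '2017-02-15' AND t.fecha < '2017-03-15'
    GROUP BY {group_by}
    ORDER BY e.nombre
"""
per_person = conn.execute(QUERY.format(group_by="e.nombre")).fetchall()
# Grouping by the timestamp as well splits every punch into its own group.
over_grouped = conn.execute(QUERY.format(group_by="e.nombre, t.fecha")).fetchall()

print(per_person)    # [('FREDDY', 1), ('MARIA', 3)]
print(over_grouped)  # [('FREDDY', 1), ('MARIA', 1), ('MARIA', 1), ('MARIA', 1)]
```

The punches on 14 Feb at 23:59:59 and on 15 Mar at 08:00:00 fall outside the range, and the hidden timestamp in the group-by turns each total into a run of 1s.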
I was trying to find which customer has the most records in a table. The RANK function was suggested, but it wasn't useful for finding the exact record, so I used the following snippet:
select count(customerkey),customerkey
FROM FILEMAPPERTEMPLATE
group by customerkey;
Result :
1 298,254
1 299,732
2 246,027
43 197,053
1 299,745
1 299,751
60 271,623
Though I am able to find how many records are attributed to each customerkey in the table, I couldn't find the single record (after executing the query) that has the maximum number of records for a customer. Please help.
I want only
60 271,623 as the result
select * from (select count(customerkey) cnt,customerkey
FROM FILEMAPPERTEMPLATE
group by customerkey order by cnt desc) where rownum<2;
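The same "sort by count, keep the first row" idea can be sketched in SQLite via Python, where `LIMIT 1` plays the role of Oracle's `rownum<2` (Oracle 12c and later also accept `FETCH FIRST 1 ROW ONLY`). The lowercase table name and the generated rows are assumptions that mimic a few of the counts shown in the question. Note that if two customers tie for the maximum, only one of them is returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filemappertemplate (customerkey INTEGER)")
# Hypothetical rows reproducing some of the counts from the question.
counts = {271623: 60, 197053: 43, 246027: 2, 298254: 1}
rows = [(key,) for key, n in counts.items() for _ in range(n)]
conn.executemany("INSERT INTO filemappertemplate VALUES (?)", rows)

# Sort the per-customer counts in descending order and keep the top row.
top = conn.execute(
    """
    SELECT COUNT(customerkey) AS cnt, customerkey
    FROM filemappertemplate
    GROUP BY customerkey
    ORDER BY cnt DESC
    LIMIT 1
    """
).fetchone()
print(top)  # (60, 271623)
```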
I'm making a report in Cognos Report Studio and I'm having a bit of trouble getting a count that I need. What I need to do is count the number of IDs for a department. But I need to split the count between initiated and completed. If an ID occurs more than once, it is to be counted as completed. The others, of course, will be initiated. So I'm trying to count the number of occurrences of each distinct ID. Here is the query I've made in SQL Developer:
SELECT
COUNT((CASE WHEN COUNT(S.RFP_ID) > 1 THEN MAX(CT.GCT_STATUS_HISTORY_CLOSE_DT) END)) AS "Sales Admin Completed"
,COUNT((CASE WHEN COUNT(S.RFP_ID) = 1 THEN MIN(CT.GCT_STATUS_HISTORY_OPEN_DT) END)) as "Sales Admin Initiated"
FROM
ADM.B_RFP_WC_COVERAGE_DIM S
JOIN ADM.B_GROUP_CHANGE_REQUEST_DIM CR
ON S.RFP_ID = CR.GCR_RFP_ID
JOIN ADM.GROUP_CHANGE_TASK_FACT CT
ON CR.GROUP_CHANGE_REQUEST_KEY = CT.GROUP_CHANGE_REQUEST_KEY
JOIN ADM.B_DEPARTMENT_DIM D
ON D.DEPARTMENT_KEY = CT.DEPARTMENT_RESP_KEY
WHERE CR.GCR_CHANGE_TYPE_ID = '20'
AND S.RFP_LOB_IND = 'WC'
AND S.RFP_AUDIT_IND = 'N'
AND CR.GCR_RECEIVED_DT BETWEEN '01-JAN-13' AND '31-DEC-13'
AND D.DEPARTMENT_DESC = 'Sales'
AND CT.GCT_STATUS_IND = 'C'
GROUP BY S.RFP_ID ;
Now this works. But I'm not sure how to translate that into Cognos. I tried a CASE that looked like this (this code uses basic names such as dept instead of D.DEPARTMENT_DESC):
CASE WHEN dept = 'Sales' AND count(ID for {DISTINCT ID}) > 1 THEN count(distinct ID)END)
I'm using count(distinct ID) instead of count(maximum(close_date)). But the results would be the same anyway. The "AND" is where I think it's being lost. It obviously isn't the proper way to count occurrences. But I'm hoping I'm close. Is there a way to do this with a CASE? Or at all?
--EDIT--
To make my question more clear, here is an example:
Say I have this data in my table
ID
---
1
2
3
4
2
5
5
6
2
My desired count output would be:
Initiated Completed
--------- ---------
4 2
This is because two of the distinct IDs (2 and 5) occur more than once, so they are counted as Completed. The ones that occur only once are counted as Initiated. I am able to do this in SQL Developer, but I can't figure out how to do this in Cognos Report Studio. I hope this helps to better explain my issue.
Oh, I didn't quite get it originally; amending the answer.
But it's still easiest to do with 2 queries in Report Studio. The key point is that you can use a query as the source for another query, which guarantees the proper group-bys and calculations.
So if you have the list of IDs in a table, in Report Studio you create:
Query 1 with data items:
ID
count_occurences: count(*) or count(1)
status (initiated or completed), with the formula: if (count_occurences > 1) then ('completed') else ('initiated')
After that, you create Query 2, using Query 1 as its source, with just 2 data items:
[Query1].[Status]
Count, with the formula: count([Query1].[ID])
That will give you the result you're after.
Here's a link to the documentation on how to nest queries:
http://pic.dhe.ibm.com/infocenter/cx/v10r1m0/topic/com.ibm.swg.ba.cognos.ug_cr_rptstd.10.1.0.doc/c_cr_rptstd_wrkdat_working_with_queries_rel.html?path=3_3_10_6#cr_rptstd_wrkdat_working_with_queries_rel
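The two-query idea maps directly onto a nested query in plain SQL. The sketch below is not Cognos: it runs the equivalent SQL in SQLite through Python, with a hypothetical `requests` table holding the sample IDs from the question. The inner query plays the part of Query 1 and the outer one the part of Query 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (id INTEGER)")
# The sample IDs from the question: 2 and 5 occur more than once.
sample_ids = [1, 2, 3, 4, 2, 5, 5, 6, 2]
conn.executemany("INSERT INTO requests VALUES (?)", [(i,) for i in sample_ids])

status_counts = conn.execute(
    """
    SELECT status, COUNT(id) AS id_count
    FROM (
        -- "Query 1": one row per ID with its derived status
        SELECT id,
               CASE WHEN COUNT(*) > 1 THEN 'completed' ELSE 'initiated' END AS status
        FROM requests
        GROUP BY id
    ) AS query1
    GROUP BY status  -- "Query 2": count the IDs per status
    ORDER BY status
    """
).fetchall()
print(status_counts)  # [('completed', 2), ('initiated', 4)]
```

This matches the desired output in the question: 4 initiated and 2 completed.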
I have a SQLite table which contains a numeric field field_name. I need to group by ranges of this column, something like this: SELECT CAST(field_name/100 AS INT), COUNT(*) FROM table GROUP BY CAST(field_name/100 AS INT), but including ranges which have no value (COUNT for them should be 0). How can I perform such a query?
You can do this by using a join and (though kludgy) an extra table.
The extra table would contain each of the values you want a row for in the response to your query. This would not only fill in missing CAST(field_name/100 AS INT) values between your returned values, but also let you expand the range: if your current groups were 5, 6, 7, you could include 0 through 10.
In other flavors of SQL you'd be able to right join or full outer join, and you'd be on your way. Alas, SQLite versions before 3.39.0 don't offer these.
Accordingly, we'll use a cross join (join everything to everything) and then filter. If you've got a relatively small database or a small number of groups, you're in good shape. If you have large numbers of both, this will be a very intensive way to go about this (the cross join result will have #ofRowsOfData * #ofGroups rows, so watch out).
Example:
TABLE: groups_for_report
desired_group
-------------
0
1
2
3
4
5
6
Table: data
fieldname other_field
--------- -----------
250 somestuff
230 someotherstuff
600 stuff
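Note that a cross join filtered in the where clause behaves like an inner join, so groups with no matching rows would drop out instead of showing 0. To keep them, left join from the groups table, which SQLite does support, with a query like
select groups_for_report.desired_group, count(data.fieldname)
from groups_for_report
left join data
on CAST(data.fieldname/100.0 AS INT)=groups_for_report.desired_group
group by groups_for_report.desired_group;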
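As a variation on the extra table, a recursive common table expression can generate the group numbers on the fly, and a left join from those groups to the data keeps the empty ranges with a count of 0. This is a sketch under assumptions: the CTE name `buckets`, the fixed range 0 through 6, and the sample rows (taken from the example tables above) are illustrative, and the table is small enough that join cost is not a concern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (fieldname INTEGER, other_field TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(250, "somestuff"), (230, "someotherstuff"), (600, "stuff")],
)

histogram = conn.execute(
    """
    WITH RECURSIVE buckets(desired_group) AS (
        SELECT 0
        UNION ALL
        SELECT desired_group + 1 FROM buckets WHERE desired_group < 6
    )
    SELECT b.desired_group, COUNT(d.fieldname) AS row_count
    FROM buckets AS b
    LEFT JOIN data AS d
        ON CAST(d.fieldname / 100 AS INT) = b.desired_group
    GROUP BY b.desired_group
    ORDER BY b.desired_group
    """
).fetchall()
print(histogram)
# [(0, 0), (1, 0), (2, 2), (3, 0), (4, 0), (5, 0), (6, 1)]
```

COUNT(d.fieldname) counts only non-NULL values, so groups that the left join could not match report 0 rather than 1.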