I have a SAS dataset of 200 million records with Date, ID, and Amount for the year 2022.
Have:
ID Date Amount
101 1/31/2022 50
102 1/31/2022 100
101 2/28/2022 25
.....................
I am trying to get the unique number of accounts for each month, but it is taking more than an hour to give me the results:
proc sql;
create table unique as select distinct
date,
count(distinct id) as uniqid
from Have
group by date;
quit;
Is there an alternative method to get the results quickly?
One alternative is to use proc freq with the nlevels option and read its ODS output. The one catch is that your data must be sorted by date. If it is not, you'll need to sort your 200M-row table by date, which could take much longer than the SQL statement alone. If you have to sort, keep only the two variables of interest in a temporary table.
proc sort data = have(keep=date id)
out = have_sorted;
by date;
run;
ods select none;
proc freq data=have_sorted nlevels;
by date;
tables id;
ods output nlevels=unique_count(keep = date nlevels
rename = (nlevels = uniqid)
);
run;
ods select all;
Other options include trying to tune your bufno and bufsize system options. Check out this case study from SANrd on how this affects I/O.
Also consider looking into the %freqyeah macro, which parallelizes proc freq: "From FREQing Slow to FREQing Fast: Facilitating a Four-Times-Faster FREQ with Divide-and-Conquer Parallel Processing".
This probably won't be faster, but it may help you judge SQL performance.
data amount;
input ID :$3. Date :mmddyy. Amount;
format date mmddyy10.;
cards;
101 1/31/2022 50
102 1/31/2022 100
101 2/28/2022 25
101 3/31/2022 50
102 3/31/2022 100
103 4/30/2022 25
103 5/31/2022 50
102 5/31/2022 100
101 6/30/2022 25
101 1/31/2022 50
102 1/31/2022 100
101 2/28/2022 25
101 3/31/2022 50
102 3/31/2022 100
103 4/30/2022 25
103 5/31/2022 50
102 5/31/2022 100
101 6/30/2022 25
;;;;
run;
proc summary data=amount nway;
class date id;
output out=out1(drop=_type_ rename=(_freq_=obs));
run;
proc summary data=out1 nway;
class date;
output out=out2(drop=_type_) sum(obs)=;
run;
proc print;
run;
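The two proc summary steps above boil down to "collapse to one row per (date, id) pair, then count pairs per date". The following is only a rough Python sketch of that logic on the same sample rows, not a replacement for the SAS code; the variable names are made up for illustration.

```python
from collections import Counter

# Sample rows copied from the data step above: (id, date, amount).
rows = [
    ("101", "1/31/2022", 50), ("102", "1/31/2022", 100), ("101", "2/28/2022", 25),
    ("101", "3/31/2022", 50), ("102", "3/31/2022", 100), ("103", "4/30/2022", 25),
    ("103", "5/31/2022", 50), ("102", "5/31/2022", 100), ("101", "6/30/2022", 25),
] * 2  # the sample data lists every record twice

# Step 1 (like out1): one entry per (date, id) pair with its row count.
pair_counts = Counter((date, id_) for id_, date, _ in rows)

# Step 2 (like out2): per date, count the pairs (distinct ids)
# and sum the row counts (total observations).
distinct_ids = Counter(date for date, _ in pair_counts)
total_obs = Counter()
for (date, _), n in pair_counts.items():
    total_obs[date] += n

for date in distinct_ids:
    print(date, distinct_ids[date], total_obs[date])
```

In out2 the distinct count corresponds to _freq_, while obs holds the total number of rows for that date.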
Good Morning,
I am looking to add 'Previous Year Sales' for the same week to a Teradata pull, but I am struggling with the code. The baseline query I would write to get the sales for each week/year is:
SELECT a."WEEK_NBR" AS "YearWeek"
, SUM(a."Sales") AS "CurrentYearSales"
FROM "SALESTABLE" AS a
Which would result in:
YearWeek CurrentYearSales
201901 $7,499
201902 $2,300
201903 $6,360
...
202001 $4,500
202002 $9,000
202003 $8,500
I want to be able to have the Prior year's sales same week on the same line, if the prior year is there in the data table. That way the finished table would look like:
YearWeek CurrentYearSales PriorYearSales
201901 $7,499 NULL
201902 $2,300 NULL
201903 $6,360 NULL
...
202001 $4,500 $7,499
202002 $9,000 $2,300
202003 $8,500 $6,360
When I search, all I can find is how to do this with the current week's data, but is this possible with all records in the table?
Edit: As you SUMmed the data you need to aggregate before the join.
You need a self join, assuming WEEK_NBR is numeric:
with cte as
(
SELECT a."WEEK_NBR" AS "YearWeek"
, SUM(a."Sales") AS "CurrentYearSales"
FROM "SALESTABLE" AS a
group by 1
)
select ...
from cte as t1
left join cte as t2
on t2.YearWeek = t1.YearWeek - 100
As @Andrew noted, this will also work for a string, because Teradata automatically casts to a float when you add a number to a string or compare a numeric value with a string (I would prefer writing an explicit typecast, though).
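To see the aggregate-then-self-join pattern end to end, here is a small sketch that runs it against SQLite through Python instead of Teradata. The lowercase table and column names, the sample rows (week 201901 is split into two rows just to exercise the SUM), and the select list are assumptions for illustration; the original leaves the select list elided.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salestable (week_nbr INTEGER, sales INTEGER)")
conn.executemany(
    "INSERT INTO salestable VALUES (?, ?)",
    [(201901, 7000), (201901, 499), (201902, 2300), (201903, 6360),
     (202001, 4500), (202002, 9000), (202003, 8500)],
)

query = """
WITH cte AS (
    -- aggregate first, so the join sees one row per week
    SELECT week_nbr AS YearWeek, SUM(sales) AS CurrentYearSales
    FROM salestable
    GROUP BY week_nbr
)
SELECT t1.YearWeek,
       t1.CurrentYearSales,
       t2.CurrentYearSales AS PriorYearSales
FROM cte AS t1
LEFT JOIN cte AS t2
    ON t2.YearWeek = t1.YearWeek - 100   -- same week, previous year
ORDER BY t1.YearWeek
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Weeks without a prior-year match come back with None (NULL) in the last column, which matches the desired output in the question.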
I am planning to store query data in a sqlite3 database.
I have these fields in sqlite3:
UNIX_EPOCH, CUMULATIVE_QUERY_RATE
1452128581, 150
1452128582, 190
1452128583, 220
1452128584, 270
I want to get a queries-per-second column as below:
QPS
0
40
30
50
How do I do it in sqlite3?
You simply have to subtract the value of the previous second:
SELECT unix_epoch,
(SELECT T1.cumulative_query_rate - T2.cumulative_query_rate
FROM SuperSecretTableName AS T2
WHERE T1.unix_epoch - 1 = T2.unix_epoch
) AS qps
FROM SuperSecretTableName AS T1;
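A quick way to check the query is to run it on the four sample rows with Python's built-in sqlite3 module. This is only a sketch: the column types and the ORDER BY are additions for the sake of a repeatable result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SuperSecretTableName ("
    "unix_epoch INTEGER PRIMARY KEY, cumulative_query_rate INTEGER)"
)
conn.executemany(
    "INSERT INTO SuperSecretTableName VALUES (?, ?)",
    [(1452128581, 150), (1452128582, 190), (1452128583, 220), (1452128584, 270)],
)

# Same correlated subquery as above, with an ORDER BY added.
qps_rows = conn.execute("""
    SELECT unix_epoch,
           (SELECT T1.cumulative_query_rate - T2.cumulative_query_rate
            FROM SuperSecretTableName AS T2
            WHERE T1.unix_epoch - 1 = T2.unix_epoch
           ) AS qps
    FROM SuperSecretTableName AS T1
    ORDER BY unix_epoch
""").fetchall()

for row in qps_rows:
    print(row)
```

Note that the first row has no previous second, so its qps is NULL (None in Python) rather than the 0 shown in the question; wrapping the subquery in IFNULL(..., 0) would turn that into 0.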
I am trying to select multiple values from the same table. I need to select count values of the column rfid and process_status from the posts table and station_title from stations table.
Here are the two tables:
Posts table :
Id ownerId rfid stationId stationType process_status
1 107 rfid1 raj1222681607 like pending
2 107 rfid1 raj1222681607 like pending
3 107 rfid1 raj1125396157 like pending
4 107 rfid1 raj1222681607 like
5 107 rfid2 raj1222681607 like pending
6 107 rfid3 raj1222681607 like
Stations table :
Id title ownerId stationId stationLike stationPic
1 Check-in one 107 raj1125396157 1 0
2 nfc station 01 107 raj1222681607 1 0
From these two tables I want to fetch data as
Total RFIDs : 5
Total Pending : 3
Station Title : nfc station 01
The where clause conditions are : ownerId = 107 and the stationId = 'raj1222681607' and the process_status = 'pending'
So far I can get the total RFIDs and the station title, but I am not able to get the total pending value by counting the process status.
My query snippet:
SELECT
COUNT(p.rfid) as TotalTap,
COUNT(p.process_status) as TotalPending,
s.title
FROM posts p
inner join
stations s
on p.stationId = s.stationId
WHERE
p.ownerId = 107 AND p.stationId = 'raj1222681607'
AND p.process_status = 'pending';
But this is giving the wrong output as :
Total RFIDs : 3 (THIS IS WRONG!!)
Total Pending : 3
Station Title : nfc station 01
To compute how many rows match a condition, use SUM over a boolean expression:
SELECT COUNT(*) AS TotalTap,
SUM(process_status = 'pending') AS TotalPending
FROM Posts
WHERE ownerId = 107
AND stationId = 'raj1222681607';
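The reason the original query returned 3 is that the process_status filter in the WHERE clause removed the non-pending rows before COUNT ran. Here is a small sketch of the SUM approach on the sample data, using Python's sqlite3 module; storing the blank process_status values as NULL is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Posts (Id INTEGER, ownerId INTEGER, rfid TEXT, "
    "stationId TEXT, stationType TEXT, process_status TEXT)"
)
conn.executemany(
    "INSERT INTO Posts VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 107, "rfid1", "raj1222681607", "like", "pending"),
        (2, 107, "rfid1", "raj1222681607", "like", "pending"),
        (3, 107, "rfid1", "raj1125396157", "like", "pending"),
        (4, 107, "rfid1", "raj1222681607", "like", None),  # blank status assumed NULL
        (5, 107, "rfid2", "raj1222681607", "like", "pending"),
        (6, 107, "rfid3", "raj1222681607", "like", None),
    ],
)

# The comparison evaluates to 1, 0, or NULL per row; SUM skips the NULLs.
counts = conn.execute("""
    SELECT COUNT(*) AS TotalTap,
           SUM(process_status = 'pending') AS TotalPending
    FROM Posts
    WHERE ownerId = 107
      AND stationId = 'raj1222681607'
""").fetchone()

print(counts)
```

Summing a comparison like this works in SQLite (and MySQL); databases without numeric booleans would need SUM(CASE WHEN ... THEN 1 ELSE 0 END) instead.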
It does not really make sense to try to compute the third value in the same query; just use a separate, much simpler query:
SELECT title
FROM Stations
WHERE stationId = 'raj1222681607';
Android has a helper function for this:
String title = DatabaseUtils.stringForQuery(db,
"SELECT title FROM Stations WHERE stationId = ?",
new String[]{ "raj1222681607" });
I have a sample table with following values
SNO | Mon
-----+-------
100 | 1
101 | 1
102 | 1
100 | 2
101 | 2
102 | 2
100 | 3
101 | 3
Now I need a query to count the SNOs that are present in all 3 months.
The result should be 2, as 100 & 101 are in mon 1,2 and 3. However, 102 is only present in mon 1,2.
Thanks,
RK
This query should work in theory.
SELECT
tmpTbl.sNo
FROM
tmpTbl
GROUP BY
tmpTbl.sNo
HAVING
Count(tmpTbl.monNo) = (SELECT Count(*) FROM (SELECT tmpTbl.monNo FROM tmpTbl GROUP BY tmpTbl.monNo));
The result would be,
sNo
----
100
101
I have used two subqueries to get the result. Both are used in the HAVING clause. The first (innermost) subquery gets the unique months available in your table; the outer subquery then counts them. So the overall query can be read as "SELECT the serial number FROM the table HAVING the count of months equal to the number of unique months in the same table".
The reason I used a subquery instead of a hard-coded number is that it will still work when the number of months increases. Hope this helps!
EDIT
Here is the Query for getting the count.
SELECT
Count(*) As simpleCount
FROM
(
SELECT
tmpTbl.sNo
FROM
tmpTbl
GROUP BY
tmpTbl.sNo
HAVING
Count(tmpTbl.monNo) = (SELECT Count(*) FROM (SELECT tmpTbl.monNo FROM tmpTbl GROUP BY tmpTbl.monNo))
);
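As a sanity check, the counting query can be run on the sample data with Python's sqlite3 module. This is a sketch that assumes the table is called tmpTbl with columns sNo and monNo, as in the queries above, and that each (sNo, monNo) pair appears only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmpTbl (sNo INTEGER, monNo INTEGER)")
conn.executemany(
    "INSERT INTO tmpTbl VALUES (?, ?)",
    [(100, 1), (101, 1), (102, 1),
     (100, 2), (101, 2), (102, 2),
     (100, 3), (101, 3)],
)

# Count the serial numbers whose month count equals the number of distinct months.
simple_count = conn.execute("""
    SELECT Count(*) AS simpleCount
    FROM (
        SELECT tmpTbl.sNo
        FROM tmpTbl
        GROUP BY tmpTbl.sNo
        HAVING Count(tmpTbl.monNo) =
               (SELECT Count(*)
                FROM (SELECT tmpTbl.monNo FROM tmpTbl GROUP BY tmpTbl.monNo))
    )
""").fetchone()[0]

print(simple_count)
```

If a serial number could appear more than once in the same month, Count(DISTINCT tmpTbl.monNo) in the HAVING clause would be the safer comparison.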
My tables
table name: emp
Name Null Type
------------------------------ -------- --------------------------------------------------------------- NUMBER
EMP_NAME VARCHAR2(10)
ADDRESS VARCHAR2(15)
PH_NO NUMBER(10)
DPT_NO NUMBER
result:
1 ram ctr 8892939927 100
2 mohan ptr 7569936347 101
3 mallu ppt 9553438342 102
4 scoot dmc 9874563210 103
5 jim plr 9236548875 104
6 ravi tpt 8562398756 105
7 manju hyd 7562398742 106
8 manoj hyd 869523654 107
9 sarath ctr 9632158769 108
10 hemanth mpk 9632147852 109
table name: emp_department
Name Null Type
------------------------------ -------- -------------------------------------------------------------
EMP_NO NUMBER
DPT_NO NUMBER
PERIOD VARCHAR2(10)
Now, how do I extract (copy) the emp_no and dpt_no values into emp_department?
I think you mean you want to populate emp_department from the data in the emp table - hopefully so the dpt_no column in that table can be removed. You haven't shown the emp_no column in emp, but it was in your earlier questions so I assume that's a typo. Your data model still seems a bit strange.
So to insert values from another table:
insert into emp_department (emp_no, dpt_no)
select emp_no, dpt_no
from emp;
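Here is a small sketch of the same INSERT ... SELECT run against SQLite from Python, just to show the effect. The emp_no column in emp is assumed (as discussed above), the column types are approximations of the Oracle ones, and only three of the sample rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# emp_no is assumed to exist in emp; it is not shown in the question's describe output.
conn.execute(
    "CREATE TABLE emp (emp_no INTEGER, emp_name TEXT, address TEXT, "
    "ph_no INTEGER, dpt_no INTEGER)"
)
conn.execute("CREATE TABLE emp_department (emp_no INTEGER, dpt_no INTEGER, period TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?, ?)",
    [(1, "ram", "ctr", 8892939927, 100),
     (2, "mohan", "ptr", 7569936347, 101),
     (3, "mallu", "ppt", 9553438342, 102)],
)

# Copy the two columns across; period is not listed, so it stays NULL.
conn.execute(
    "INSERT INTO emp_department (emp_no, dpt_no) "
    "SELECT emp_no, dpt_no FROM emp"
)
conn.commit()

copied = conn.execute(
    "SELECT emp_no, dpt_no, period FROM emp_department ORDER BY emp_no"
).fetchall()
print(copied)
```

The period column is left NULL here because nothing in emp supplies a value for it.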
You seem to be asking very basic questions without showing much research. Please read the documentation or at least some tutorials, and show what you've tried and what specific problems you're having.