I have a table with the following columns:
Account_No, Start_Date, End_Date
I loaded this table into Power Query using a SQL SELECT command through ODBC.
Now I want to get the sum and count of transactions, from Start_Date to End_Date, for every account in the table above. The transactions are in another table, e.g. Transaction_Table. What should I do to get the desired result?
Regards
KAM
You probably don't need Power Query at all at this point.
Assuming your DB server is MS SQL Server 2008 or higher,
WITH t1([Account_No], StartDate, EndDate) As
(
SELECT [Account_No], StartDate = MIN([Start_Date]), EndDate = MAX([End_Date])
FROM Table1
GROUP BY Account_No
)
SELECT
t1.[Account_No]
, Amount = SUM(t2.[Field_Transaction_Total])
, [Transaction_Count] = COUNT(t2.[Field_Transaction_ID])
FROM [Transaction_Table] t2
INNER JOIN t1 ON t2.[Account_No] = t1.Account_No
AND t2.[Field_Transaction_Date] BETWEEN t1.StartDate AND t1.EndDate
GROUP BY t1.[Account_No]
You can also run a copy of the query inside the WITH block on its own to bring the table of accounts and dates into Excel, if you need it.
If you use a different database server, adapt this code to its SQL dialect; I hope you get the idea.
You could use a GROUP BY statement in the SQL you wrote, or you could filter the table based on the Start_Date to the End_Date and then right-click on the column you want to count and choose Group By.
I would start a new Query based on your Transaction_Table. Then I would add a Merge step, joining to your 1st Query on Account_No. Then I would Expand the Start_Date and End_Date from the generated NewColumn.
Next I would Add a Custom Column, and write a formula like this:
= [Transaction_Date] >= [Start_Date] and [Transaction_Date] <= [End_Date]
The resulting column will show TRUE or FALSE. Filter for TRUE.
Finally I would add a Group By step to Sum and Count as required.
I hope I've understood your requirement correctly - it wasn't really clear from your question.
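For anyone who wants to check the merge, filter and group logic outside Power Query, here is a minimal sketch in Python using an in-memory SQLite database. The table name Account_Periods, the Amount column and the sample rows are made up for illustration; only Account_No, Start_Date, End_Date and Transaction_Table come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account_Periods (Account_No TEXT, Start_Date TEXT, End_Date TEXT);
CREATE TABLE Transaction_Table (Account_No TEXT, Transaction_Date TEXT, Amount REAL);

INSERT INTO Account_Periods VALUES
    ('A1', '2024-01-01', '2024-01-31'),
    ('A2', '2024-02-01', '2024-02-29');

INSERT INTO Transaction_Table VALUES
    ('A1', '2024-01-05', 100.0),
    ('A1', '2024-01-20',  50.0),
    ('A1', '2024-02-02', 999.0),  -- outside A1's period, must be ignored
    ('A2', '2024-02-10',  25.0),
    ('A2', '2024-01-15', 999.0);  -- outside A2's period, must be ignored
""")

# Merge on Account_No, keep only transactions inside the period,
# then group by account to get the sum and the count.
rows = conn.execute("""
    SELECT p.Account_No,
           SUM(t.Amount) AS Amount,
           COUNT(*)      AS Transaction_Count
    FROM Transaction_Table AS t
    JOIN Account_Periods   AS p ON p.Account_No = t.Account_No
    WHERE t.Transaction_Date >= p.Start_Date
      AND t.Transaction_Date <= p.End_Date
    GROUP BY p.Account_No
    ORDER BY p.Account_No
""").fetchall()

print(rows)  # [('A1', 150.0, 2), ('A2', 25.0, 1)]
```

The dates are stored as ISO text (YYYY-MM-DD) so that plain string comparison orders them correctly; with a real date type the comparison is the same.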
I have the following data in my table:
I need the output to be the following in Snowflake:
Basically, the rows are ordered by transaction date, and for each consecutive run of the same country and city I need the first transaction, the last transaction, and the count of transactions in that run. I tried using window functions but I'm not getting the desired result. The tricky part is that the grouping has to follow the sequence: you can see TEXAS and CALIFORNIA repeating depending on the order of transactions for the country and city.
Ideally this would be done in a query; second best would be some other computation that is fast. It has to run on batches of data. I would rather not pull the data in order and walk through it row by row unless that is the only option, but I'm open to advice on that as well. Thanks!
Hint: GROUP BY, MIN, MAX, COUNT
I was able to work out the logic, and the following query works:
select countryid, regionid, min(requesttime), max(requesttime), count(*) from (select deviceid,countryid,regionid,cityid, requesttime,
row_number() over (partition by countryid order by requesttime) as seqnum_1,
row_number() over (partition by countryid, regionid order by requesttime) as seqnum_2
from table t order by requesttime
) t group by countryid, regionid, (seqnum_1 - seqnum_2) order by min(requesttime);
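The reason the difference of the two row numbers works: within one unbroken run of the same region, both counters increase by one per row, so their difference stays constant; as soon as another region interrupts the run, the per-country counter moves ahead and the difference changes. Below is a small sketch of the same query run against SQLite from Python (assuming SQLite 3.25 or newer for window functions). The table name requests and the sample rows are invented for the demonstration, and regionid is stored as text here only for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (countryid INTEGER, regionid TEXT, requesttime TEXT)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        (1, "TEXAS",      "2024-01-01 10:00"),
        (1, "TEXAS",      "2024-01-01 10:05"),
        (1, "CALIFORNIA", "2024-01-01 10:10"),
        (1, "TEXAS",      "2024-01-01 10:15"),
        (1, "CALIFORNIA", "2024-01-01 10:20"),
        (1, "CALIFORNIA", "2024-01-01 10:25"),
    ],
)

# seqnum_1 numbers rows per country, seqnum_2 per country and region.
# Their difference is constant inside each consecutive run of a region.
islands = conn.execute("""
    SELECT countryid, regionid,
           MIN(requesttime), MAX(requesttime), COUNT(*)
    FROM (
        SELECT countryid, regionid, requesttime,
               ROW_NUMBER() OVER (PARTITION BY countryid
                                  ORDER BY requesttime) AS seqnum_1,
               ROW_NUMBER() OVER (PARTITION BY countryid, regionid
                                  ORDER BY requesttime) AS seqnum_2
        FROM requests
    ) AS t
    GROUP BY countryid, regionid, (seqnum_1 - seqnum_2)
    ORDER BY MIN(requesttime)
""").fetchall()

for row in islands:
    print(row)
# (1, 'TEXAS', '2024-01-01 10:00', '2024-01-01 10:05', 2)
# (1, 'CALIFORNIA', '2024-01-01 10:10', '2024-01-01 10:10', 1)
# (1, 'TEXAS', '2024-01-01 10:15', '2024-01-01 10:15', 1)
# (1, 'CALIFORNIA', '2024-01-01 10:20', '2024-01-01 10:25', 2)
```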
I'm trying to calculate a running sum for an account statement using an MS Access query, but I have a problem when there are duplicate dates: the sum does not change until the date changes.
This is the image link for the results:
https://i.stack.imgur.com/eUfa2.png
And this is the query:
SELECT Trans.TransDate, Trans.Cr, Trans.Dr, (SELECT SUM(t.[Dr]-t.[Cr]) FROM Trans t WHERE t.[TransDate]<= Trans.[TransDate] AND t.Account = Trans.Account) AS Balance
FROM Trans
WHERE (((Trans.Account)="Cash"))
ORDER BY Trans.TransDate;
You can't do that in a query without a unique key. For your sample, you might be able to include the amounts in the comparison, but that will fail again as soon as two or more records of the account have both the same date and the same amount.
In VBA, you could open the query as a recordset and loop the records while you add up the running sum.
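To illustrate why the unique key matters, here is a sketch that assumes the table has (or is given) a unique, increasing key. The ID column below is hypothetical and not part of the original Trans table; it is used only to break ties between rows that share a date. The sketch runs against SQLite from Python rather than Access, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Trans (
        ID        INTEGER PRIMARY KEY,  -- hypothetical unique key (tie-breaker)
        Account   TEXT,
        TransDate TEXT,
        Dr        INTEGER,
        Cr        INTEGER
    )
""")
conn.executemany(
    "INSERT INTO Trans VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Cash", "2024-01-01", 100, 0),
        (2, "Cash", "2024-01-02", 50, 0),   # same date as the next row
        (3, "Cash", "2024-01-02", 0, 30),
        (4, "Cash", "2024-01-03", 20, 0),
        (5, "Bank", "2024-01-02", 500, 0),  # other account, must not count
    ],
)

# A row counts toward the balance if it is on an earlier date, or on the
# same date with an ID not greater than the current row's ID.
statement = conn.execute("""
    SELECT o.TransDate, o.Dr, o.Cr,
           (SELECT SUM(t.Dr - t.Cr)
            FROM Trans AS t
            WHERE t.Account = o.Account
              AND (t.TransDate < o.TransDate
                   OR (t.TransDate = o.TransDate AND t.ID <= o.ID))) AS Balance
    FROM Trans AS o
    WHERE o.Account = 'Cash'
    ORDER BY o.TransDate, o.ID
""").fetchall()

for row in statement:
    print(row)
# ('2024-01-01', 100, 0, 100)
# ('2024-01-02', 50, 0, 150)
# ('2024-01-02', 0, 30, 120)
# ('2024-01-03', 20, 0, 140)
```

With only `t.TransDate <= o.TransDate` in the subquery, both rows dated 2024-01-02 would show 120, which is the behaviour described in the question.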
I am pulling data from 4 tables using a combination of 3 queries. All 3 queries contain one common field, "PurchaseOrderNo". I also have "DateUpdated" and "TimeUpdate" fields that I think I might be able to use for this issue. The final query produces filtered data with only the information needed. I am trying to figure out how to make the query return only the results that are new since the query was last run, if that makes sense. This is my SQL; ignore the filters already in place. Date format = MM/DD/YYYY, time format = ##.####
SELECT po_detail2.PurchaseOrderNo, po_detail2.VendorNo, po_detail2.ItemCode, po_detail2.LotSerialNo, IM068_MXPUnivProdCode.UDF_UNIQUE_KEY, Right([UDF_UNIQUE_KEY],1) AS SIZE_INDEX, Left([UDF_UNIQUE_KEY],Len([UDF_UNIQUE_KEY])-1) AS INVENTORY_KEY
FROM po_detail2 LEFT JOIN IM068_MXPUnivProdCode ON po_detail2.LotSerialNo = IM068_MXPUnivProdCode.LotSerialNo
WHERE (((po_detail2.PurchaseOrderNo)="0056334" Or (po_detail2.PurchaseOrderNo)>"0056334") AND ((po_detail2.ItemCode)="K500" Or (po_detail2.ItemCode)="PC55"))
ORDER BY po_detail2.PurchaseOrderNo DESC;
I ended up using a query that writes a timestamp to a table, and then filtering for results newer than the last time the query was run. This is the SQL of that query.
INSERT INTO tblQueryLastRun ( dtmQueryLastRun )
VALUES (Now());
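A rough sketch of how the stored timestamp could drive the filter, written in Python against SQLite rather than Access. It assumes DateUpdated holds a full, sortable timestamp as ISO text (the question has separate date and time fields, so they would need to be combined first); the helper name fetch_new_rows and the sample rows are made up for illustration.

```python
import sqlite3

def fetch_new_rows(conn, now):
    """Return purchase orders updated since the last run, then record this run."""
    last_run = conn.execute(
        "SELECT MAX(dtmQueryLastRun) FROM tblQueryLastRun"
    ).fetchone()[0]  # None when the query has never been run
    rows = conn.execute(
        """
        SELECT PurchaseOrderNo
        FROM po_detail2
        WHERE ? IS NULL OR DateUpdated > ?
        ORDER BY PurchaseOrderNo
        """,
        (last_run, last_run),
    ).fetchall()
    # Equivalent of the INSERT ... VALUES (Now()) above, with the time passed in.
    conn.execute("INSERT INTO tblQueryLastRun (dtmQueryLastRun) VALUES (?)", (now,))
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE po_detail2 (PurchaseOrderNo TEXT, DateUpdated TEXT)")
conn.execute("CREATE TABLE tblQueryLastRun (dtmQueryLastRun TEXT)")
conn.executemany(
    "INSERT INTO po_detail2 VALUES (?, ?)",
    [("0056334", "2024-03-01 09:00:00"), ("0056335", "2024-03-01 10:00:00")],
)

first = fetch_new_rows(conn, "2024-03-01 12:00:00")   # first run: everything
conn.execute("INSERT INTO po_detail2 VALUES ('0056336', '2024-03-02 08:00:00')")
second = fetch_new_rows(conn, "2024-03-02 12:00:00")  # only the new order
third = fetch_new_rows(conn, "2024-03-03 12:00:00")   # nothing new

print(first, second, third)
# ['0056334', '0056335'] ['0056336'] []
```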
The below crashes my DB Browser. Essentially I am trying to sum sales ("sales") by a sales person ("name") that occurred between two dates ("beg_period" and "end_period") pulled from a separate table.
SELECT ta.name, ta.beg_period, ta.end_period,
(SELECT SUM(tb.sales)
FROM sales_log tb
WHERE ta.name = tb.name
AND tb.date BETWEEN ta.beg_period AND ta.end_period
)
FROM performance ta
;
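The nested query can be rewritten as a single query with a standard join. Note that an INNER JOIN drops any row of performance that has no matching sales in its period, whereas the subquery version returns such rows with a NULL sum; use a LEFT JOIN with the date condition moved into the ON clause if you need to keep them.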
SELECT ta.name, ta.beg_period, ta.end_period, SUM(tb.sales)
FROM performance ta INNER JOIN sales_log tb
ON ta.name = tb.name
WHERE tb.date BETWEEN ta.beg_period AND ta.end_period
GROUP BY ta.name, ta.beg_period, ta.end_period;
My guess is that the original query was okay (however inefficient), but DB Browser just didn't know how to interpret the subquery in whatever parsing it attempts. In other words, just because it crashed DB Browser doesn't mean that it would crash the SQLite library. Try another SQLite database manager.
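One way to test both points without DB Browser is to run the queries through Python's built-in sqlite3 module. The sketch below uses invented sample rows; it compares the correlated subquery, the INNER JOIN rewrite, and a LEFT JOIN variant (my addition, not from the answer above) that keeps people with no sales in the period.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE performance (name TEXT, beg_period TEXT, end_period TEXT);
CREATE TABLE sales_log (name TEXT, date TEXT, sales INTEGER);

INSERT INTO performance VALUES
    ('Ann', '2024-01-01', '2024-01-31'),
    ('Bob', '2024-01-01', '2024-01-31');

INSERT INTO sales_log VALUES
    ('Ann', '2024-01-10', 100),
    ('Ann', '2024-01-20',  50),
    ('Ann', '2024-02-05',  70),  -- outside Ann's period
    ('Bob', '2024-02-01',  30);  -- outside Bob's period
""")

subquery_rows = conn.execute("""
    SELECT ta.name,
           (SELECT SUM(tb.sales) FROM sales_log tb
            WHERE ta.name = tb.name
              AND tb.date BETWEEN ta.beg_period AND ta.end_period)
    FROM performance ta
    ORDER BY ta.name
""").fetchall()

inner_rows = conn.execute("""
    SELECT ta.name, SUM(tb.sales)
    FROM performance ta INNER JOIN sales_log tb ON ta.name = tb.name
    WHERE tb.date BETWEEN ta.beg_period AND ta.end_period
    GROUP BY ta.name, ta.beg_period, ta.end_period
    ORDER BY ta.name
""").fetchall()

left_rows = conn.execute("""
    SELECT ta.name, SUM(tb.sales)
    FROM performance ta LEFT JOIN sales_log tb
      ON ta.name = tb.name
     AND tb.date BETWEEN ta.beg_period AND ta.end_period
    GROUP BY ta.name, ta.beg_period, ta.end_period
    ORDER BY ta.name
""").fetchall()

print(subquery_rows)  # [('Ann', 150), ('Bob', None)]
print(inner_rows)     # [('Ann', 150)]
print(left_rows)      # [('Ann', 150), ('Bob', None)]
```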
I want to store data in MySQL and query it based on the current day. I want to know what is the best practice to do so.
I want to store data totals for each day, so that querying the totals will be quick. I thought about modeling my table as follows:
TotalsByCountry
- Year
- Month
- Day
- countryId
- totalNumber
When I query the totals for a specific day and for specific country, I will query the table based on 4 columns, the Year, Month, Day and countryId.
I wanted to know whether this is good practice, or whether there is a better way, such as using one column that holds the month, day and year, and querying only two columns: the datetime column and the countryId.
I need your help in choosing the right way to model the table. I also want to make another table that stores totals based on gender, so take that into consideration too.
The data will need to be accessed frequently, maybe in real time, because I want to show the data changes in real time. I will be developing the web app in asp.net and will probably use web sockets to keep a constant connection that pushes updates to the user. So when data changes, it will be reflected on the user's webpage in real time. That's why I need a table model that is ready for many queries. I will cache results for a few seconds so it won't stress the db too much.
I hope I provided enough information, if not, please comment and I will reply.
Having three separate columns to store each individual element of a date (year/month/day) will add unnecessary overhead to your database in terms of insert performance and disk space.
What you will want to do is simply have a single DATETIME column to store the date and time, and have a composite index set up on (countryId, datetime_col).
Even if you wanted to query all rows for a specific day or month, MySQL will still be able to utilize indexes on the DATETIME field, provided that you write your queries in the right way and never wrap the DATETIME column in a function when you perform your conditional check.
Here is how you can write your query so that it will still be able to utilize indexes:
-- Get the sum of totalNumber of all rows based on current day
-- where countryId = 1
SELECT SUM(totalNumber) AS totalsum
FROM tbl
WHERE countryId = 1 AND
datetime_col >= CAST(CURDATE() AS DATETIME) AND
datetime_col < CAST(CURDATE() + INTERVAL 1 DAY AS DATETIME)
By making the comparison on the bare DATETIME column, the query remains sargable (i.e. able to utilize index range scans) and MySQL will be able to use indexes to quickly look up rows.
On the other hand, if you were to try to wrap the DATETIME column within a function to make the comparison:
-- Get the sum of totalNumber of all rows based on current day
-- where countryId = 1
SELECT SUM(totalNumber) AS totalsum
FROM tbl
WHERE countryId = 1 AND
DATE(datetime_col) = CURDATE()
...it would be quite inefficient, because the DATE() function that wraps the column effectively renders the condition non-sargable, and an index containing the DATETIME column can no longer be used for a range scan on that column.
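The difference can be observed in a query plan. The sketch below uses SQLite from Python instead of MySQL (in MySQL you would look at EXPLAIN output), so treat it only as an illustration of the principle; the index name idx_country_dt and the literal dates are made up. The exact plan wording varies between SQLite versions, so no expected output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (countryId INTEGER, datetime_col TEXT, totalNumber INTEGER);
CREATE INDEX idx_country_dt ON tbl (countryId, datetime_col);
""")

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Bare column compared against a half-open range: both index columns usable.
range_plan = plan("""
    SELECT SUM(totalNumber) FROM tbl
    WHERE countryId = 1
      AND datetime_col >= '2024-03-01 00:00:00'
      AND datetime_col <  '2024-03-02 00:00:00'
""")

# Column wrapped in a function: the date part of the index cannot be searched.
func_plan = plan("""
    SELECT SUM(totalNumber) FROM tbl
    WHERE countryId = 1
      AND date(datetime_col) = '2024-03-01'
""")

print("range:", range_plan)
print("func: ", func_plan)
```

In the first plan the index search includes a constraint on datetime_col; in the second it can use countryId at most, and every row for that country has to be checked.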
You can also efficiently query for the total sum of all rows in the current month:
-- Get the sum of totalNumber of all rows based on current month
-- where countryId = 1
SELECT SUM(totalNumber) AS monthsum
FROM tbl
WHERE countryId = 1 AND
datetime_col >= CAST(CONCAT(YEAR(NOW()), '-', MONTH(NOW()), '-01') AS DATETIME) AND
datetime_col < CAST(CONCAT(YEAR(NOW()), '-', MONTH(NOW()), '-01') AS DATETIME) + INTERVAL 1 MONTH
And within the current year:
-- Get the sum of totalNumber of all rows based on current year
-- where countryId = 1
SELECT SUM(totalNumber) AS yearsum
FROM tbl
WHERE countryId = 1 AND
datetime_col >= CAST(CONCAT(YEAR(NOW()), '-01-01') AS DATETIME) AND
datetime_col < CAST(CONCAT(YEAR(NOW()), '-01-01') AS DATETIME) + INTERVAL 1 YEAR
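If you would rather compute the boundaries in the application and pass them as query parameters, the same half-open ranges (start inclusive, end exclusive) can be built as below. This is only a sketch in Python; the helper names are made up, and the question's app is ASP.NET, where the equivalent date arithmetic applies.

```python
from datetime import date, datetime, time, timedelta

def day_bounds(d):
    """[start, end) covering the whole day d."""
    start = datetime.combine(d, time.min)
    return start, start + timedelta(days=1)

def month_bounds(d):
    """[start, end) covering the month that contains d."""
    start = datetime(d.year, d.month, 1)
    if d.month == 12:
        end = datetime(d.year + 1, 1, 1)  # December rolls over to next year
    else:
        end = datetime(d.year, d.month + 1, 1)
    return start, end

def year_bounds(d):
    """[start, end) covering the year that contains d."""
    return datetime(d.year, 1, 1), datetime(d.year + 1, 1, 1)

print(day_bounds(date(2024, 2, 29)))
# (datetime.datetime(2024, 2, 29, 0, 0), datetime.datetime(2024, 3, 1, 0, 0))
print(month_bounds(date(2024, 12, 15)))
# (datetime.datetime(2024, 12, 1, 0, 0), datetime.datetime(2025, 1, 1, 0, 0))
print(year_bounds(date(2024, 6, 1)))
# (datetime.datetime(2024, 1, 1, 0, 0), datetime.datetime(2025, 1, 1, 0, 0))
```

The query then becomes `datetime_col >= start AND datetime_col < end` with both values bound as parameters, which keeps the column bare.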
My argument is:
If you want fast database lookups, you need well-built queries that use indexes.
Your approach requires 4 indexes (which means slower inserts), whereas with a single date column you need just two. The query complexity will also increase if you ever need to search for date ranges.