Running total from 2 columns in Access - ms-access-2010

I am developing a small Access database to track my stock. I have two columns (In and Out). I would like to create a running total in a query and report to show the +/- balance when stock is received or an item leaves the stock.
Example:
Stock ID  Opening  Date       In  Out  RunTot
1         500      1/2/2019   60  0    30
3         200      1/10/2019  40  0    70
1                  1/30/2019  0   5    65
Up to now, I have tried a couple of approaches, but none produces the result I need.
RunTot: DSum("[QtyIn] + [Opening] - [QtyOut]","tblTransaction","[StockID] <= " & [StockID])
Your assistance will be highly appreciated.
Thanks,
Oscar
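A minimal sketch of one common fix: key the DSum on the transaction date rather than on StockID, so each row sums every movement up to and including its own date. The [TransDate] field name is an assumption (the post only shows a Date column), and a StockID condition can be ANDed into the criteria if the total should run per stock:

RunTot: DSum("[QtyIn] - [QtyOut]", "tblTransaction", "[TransDate] <= #" & [TransDate] & "#")

The opening balance can then be added on top, and if several transactions can share a date, an autonumber makes a safer ordering key.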

Updating a table with the rolling average of previous rows in R?

So I have a table where every row represents a given user in a specific event. Each row contains two types of information: the outcomes of that event, and data about the user specifically. Multiple users can take part in the same event.
For clarity, here is a simplified example of such a table:
EventID  Date      Revenue  Time(s)  UserID  X  Y  Z
1        1/1/2017  $10      120      1       3  2  2
1        1/1/2017  $15      150      2       2  1  2
2        2/1/2017  $50      60       1       1  5  1
2        2/1/2017  $45      100      4       3  5  2
3        3/1/2017  $25      75       1       2  3  1
3        3/1/2017  $20      210      2       5  5  1
3        3/1/2017  $25      120      3       1  0  4
3        3/1/2017  $15      100      4       3  1  1
4        4/1/2017  $75      25       4       0  2  1
My goal is to build a model that can, given a specific user's performance history (in the example, attributes X, Y and Z), predict the revenue and time for an event.
What I am after now is a way to format my data in order to train and test such a model. More specifically, I want to transform the table so that each row keeps the event-specific information while presenting the moving average of each user's attributes up until the previous event. An example of the thought process: a user up until an event presents averages of 2, 3.5, and 1.5 in attributes X, Y and Z respectively, and the revenue and time outcomes of that event were $25 and 75; I will now use this as an input for my training.
Once again for clarity, here is an example of the output I would expect from applying this logic to the original table:
EventID  Date      Revenue  Time(s)  UserID  X  Y    Z
1        1/1/2017  $10      120      1       0  0    0
1        1/1/2017  $15      150      2       0  0    0
2        2/1/2017  $50      60       1       3  2    2
2        2/1/2017  $45      100      4       0  0    0
3        3/1/2017  $25      75       1       2  3.5  1.5
3        3/1/2017  $20      210      2       2  1    2
3        3/1/2017  $25      120      3       0  0    0
3        3/1/2017  $15      100      4       3  5    2
4        4/1/2017  $75      25       4       3  3    1.5
Notice that in each user's first appearance all attributes are 0, since we still know nothing about them. Also, in a user's second appearance, all we know is the result of his first appearance. In lines 5 and 9, the third appearances of users 1 and 4 start to show the rolling mean of their previous performances.
If I were dealing with only a single user, I would tackle this problem by simply calculating the moving average of his attributes, and then shifting only the data in the attribute columns down one row. My questions are:
Is there a way to perform such shift filtered by UserID, when dealing with a table with multiple users?
Or is there a better way in R to calculate the rolling mean directly from the original table by always placing a result in each user's next appearance?
It can be assumed that all rows are already sorted by date. Any other tips or references related to this problem are also welcome.
Also, it wasn't obvious how to summarize my question in a one-liner title, so I'm open to suggestions from any R experts who might think of a better way of describing it.
We can achieve your desired output using the dplyr package.
library(dplyr)
tablinka %>%
  arrange(UserID, EventID) %>%
  group_by(UserID) %>%
  mutate_at(c("X", "Y", "Z"), cummean) %>%
  mutate_at(c("X", "Y", "Z"), lag) %>%
  mutate_at(c("X", "Y", "Z"), funs(ifelse(is.na(.), 0, .))) %>%
  arrange(EventID, UserID) %>%
  ungroup()
We arrange the data, group it, and then apply the desired transformations (the dplyr functions cummean, lag, and replacing NA with 0 using an ifelse).
Once this is done, we rearrange the data to its original state, and ungroup it.
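As a side note, funs() is deprecated in dplyr 0.8 and later. On a current dplyr (1.0+) the same pipeline can be written with across(), using lag()'s default argument to fill each user's first appearance with 0; a sketch under that assumption:

library(dplyr)

tablinka %>%
  arrange(UserID, EventID) %>%
  group_by(UserID) %>%
  # running mean per user, shifted back one event; 0 before any history exists
  mutate(across(c(X, Y, Z), ~ lag(cummean(.x), default = 0))) %>%
  arrange(EventID, UserID) %>%
  ungroup()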

Teradata: updating a table in a cost-effective manner

I have a table that has sequence numbers. It's a very big table, 16 million rows give or take. The table has a key, and it has events that happen to that key. Every time the key changes, the seq_num restarts (in theory).
In the original table there was a timestamp associated with each event. In order to get the duration of the event, I created a lag column and subtracted the lag column from the timestamp of the current event, giving us the duration. This duration is called time_in_minutes in the table below.
The new table has a number of properties:
Each key in this case is a car wash, with each event being assigned a category; so on line 3 the car was submitted to a drying procedure for 45 mins.
The second line, which contains 23 mins, isn't actually 23 mins for the wash; it took the machine 23 minutes to power up.
For key 144 the record for the powering up of the machine is missing. This seems to be prevalent in the data set.
key  Event     time in mins  seq_num
1    Start     0             1
1    Wash      23            2
1    Dry       45            3
1    Wash      56            4
1    Wash      78            5
1    Boil      20            6
1    ShutDown  11            7
2    Start     0             1
2    Wash      11            2
2    Dry       12            3
...
144  Wash      0             1
144  Wash      11            2
144  Dry       12            3
I would like to move the time_in_mins from the seq_num 2 record up to the seq_num 1 record when that first record is an Event of type Start, so that when we aggregate this later the minutes will be properly assigned to starting up.
I could try to update the table by creating a new column, again with another lag, this time for time_in_mins, but this seems quite expensive.
Does anyone know of a clever way of doing this?
Edit 14/10/2016
The final output for the customer looks like the below, albeit slightly out of order:
key  event     total minutes
1    Start     23
1    Boil      20
1    Dry       45
1    Wash      134
1    ShutDown  11
2    Start     11
2    Dry       12
2    Wash      0
Thanks for your help
This will switch the 1st and 2nd values based on your description, resulting in a single STAT step in Explain:
SELECT key, seq_num, event,
       CASE
          WHEN seq_num = 1
           AND event = 'Start'
          THEN Min(CASE WHEN seq_num = 2 THEN time_in_mins ELSE 0 END)
               Over (PARTITION BY key)
          WHEN seq_num = 2
           AND Min(CASE WHEN seq_num = 1 THEN event END)
               Over (PARTITION BY key) = 'Start'
          THEN 0
          ELSE time_in_mins
       END AS new_time_in_mins
FROM tab
Now you can do the sum.
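For the aggregated output shown in the edit, the follow-up sum could then look like this (a sketch: switched is just a name for the derived table, whose body is the query above):

SELECT key, event,
       Sum(new_time_in_mins) AS total_minutes
FROM
 (
   SELECT key, seq_num, event,
          CASE
             WHEN seq_num = 1
              AND event = 'Start'
             THEN Min(CASE WHEN seq_num = 2 THEN time_in_mins ELSE 0 END)
                  Over (PARTITION BY key)
             WHEN seq_num = 2
              AND Min(CASE WHEN seq_num = 1 THEN event END)
                  Over (PARTITION BY key) = 'Start'
             THEN 0
             ELSE time_in_mins
          END AS new_time_in_mins
   FROM tab
 ) AS switched
GROUP BY key, event;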
But it might be possible to include this logic in your previous step when you create the Volatile Table; can you add this Select there, too?

An efficient way to find the row number of a data frame, unequal condition

We are looking at the delay of a server that can only take care of one customer at a time. Let's say we have two data frames: agg_data and ind_data.
> agg_data
  minute service_minute
1      0              1
2     60              3
3    120              2
4    180              3
5    240              2
6    300              4
agg_data provides the service time between two successive customers for every hour. For instance, between minutes 60 and 120 (the second hour from the beginning), we can serve a new customer every 3 minutes, so we can serve 20 customers in total during that hour.
ind_data provides arrival minutes of each customer:
  Arrival
1      51
2      63
3     120
4     121
5     125
6     129
I need to generate the departure minutes for the customers, which are affected by the service_minute in agg_data.
The output looks like:
  Arrival Dep
1      51  52
2      63  66
3     120 122
4     121 124
5     125 127
6     129 131
Here is my current code, which is correct but very inefficient:
ind_data$Dep = rep(0, nrow(ind_data))
# After the service time, the first customer can leave the system with no delay
# Service time is taken as that of the hour when the customer arrives
ind_data$Dep[1] = ind_data$Arrival[1] + agg_data[max(which(agg_data$minute<=ind_data$Arrival[1])),'service_minute']
# For customers after the first one,
# if they arrive when there is no delay (arrival time > departure time of the previous customer),
# then the service time is that of the hour when they arrive and
# departure time is arrival time + service time;
# if they arrive when there is delay (arrival time < departure time of the previous customer),
# then the service time is that of the hour when the previous customer leaves the system and
# the departure time is the departure time of the previous customer + service time.
for (i in 2:nrow(ind_data)) {
  ind_data$Dep[i] = max(
    ind_data$Dep[i-1] + agg_data[max(which(agg_data$minute <= ind_data$Dep[i-1])), 'service_minute'],
    ind_data$Arrival[i] + agg_data[max(which(agg_data$minute <= ind_data$Arrival[i])), 'service_minute']
  )
}
I think it is the step where we search for the right service time in agg_data that takes long. Is there a more efficient algorithm?
Thank you.
This should be fairly efficient. It's a very simple lookup problem with an obvious vectorized solution:
out <- data.frame(Arrival = ind_data$Arrival,
                  Dep = ind_data$Arrival + agg_data$service_minute[ # need an index to choose min
                    findInterval(ind_data$Arrival, agg_data$minute)]
                  )
> out
  Arrival Dep
1      51  52
2      63  66
3     120 122
4     121 123
5     125 127
6     129 131
I trust my code more than your example; I think there are obvious errors in it.
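Note that the vectorized lookup above drops the dependence on the previous customer's departure time. If that queueing logic is in fact required, findInterval can still replace the repeated max(which(...)) scans inside the loop; a sketch under that assumption (svc is a helper name introduced here, not from the original post):

# service time in effect at minute t (same lookup the loop does, done once per call)
svc <- function(t) agg_data$service_minute[findInterval(t, agg_data$minute)]

ind_data$Dep <- numeric(nrow(ind_data))
ind_data$Dep[1] <- ind_data$Arrival[1] + svc(ind_data$Arrival[1])
for (i in 2:nrow(ind_data)) {
  ind_data$Dep[i] <- max(ind_data$Dep[i - 1] + svc(ind_data$Dep[i - 1]),
                         ind_data$Arrival[i] + svc(ind_data$Arrival[i]))
}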

PL/SQL Return Only 1 Row for Each Grouping

I have a table that lists items like below. It basically has Operation Numbers (OP_NO) that tell where a product is in the process. These OP Numbers can be either Released or Completed. They follow a process, in that 10 must happen before 20, 20 must happen before 30, etc. However, users do not update all steps in reality, so we end up with some items complete out of order while the earlier steps are not, as shown below (OP 30 is completed but OP 10 and 20 are not).
I basically want to produce a listing of each ORDER_ID showing the furthest point of completion for each ORDER_ID. I figured I could do this by querying for STATUS = 'Completed' and sorting by OP_NO descending. However, I can't figure out how to produce only 1 result for each ORDER_ID. For example, in ORDER_ID 345, steps 10 and 20 are completed; I would only want to return that step 20 is where it currently is. I was figuring I could do this with WHERE ROWNUM <= 1 but haven't had much luck. Could any experts weigh in?
Thanks!
ORDER_ID | ORDER_SEC | ORDER_RELEASE | OP_NO | STATUS    | Description
123      | 2         | 3             | 10    | Released  | Op10
123      | 2         | 3             | 20    | Released  | Op20
123      | 2         | 3             | 30    | Completed | Op30
123      | 2         | 3             | 40    | Released  | Op40
345      | 1         | 8             | 10    | Completed | Op10
345      | 1         | 8             | 20    | Completed | Op20
345      | 1         | 8             | 30    | Released  | Op30
345      | 1         | 8             | 40    | Released  | Op40
If I understand correctly what you want, the below should do what you need. Just replace test_table with your table name.
select *
from test_table tst
where status = 'Completed'
and op_no = (select max(op_no)
             from test_table tst1
             where tst.order_id = tst1.order_id
             and status = 'Completed');
Given your sample data, this produced the results below.
Order_Id | Order_Sec | Order_Release | op_no | Status    | Description
123      | 2         | 3             | 30    | Completed | Op30
345      | 1         | 8             | 20    | Completed | Op20
Cheers
Shaun Peterson
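For comparison, Oracle's ROW_NUMBER analytic function gives the same one-row-per-ORDER_ID result with a single scan instead of a correlated subquery; a sketch against the same assumed test_table:

select order_id, order_sec, order_release, op_no, status, description
from (select tst.*,
             row_number() over (partition by order_id
                                order by op_no desc) as rn
      from test_table tst
      where status = 'Completed')
where rn = 1;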

I have two rows with different IDs; I want to add both rows and display them as one

I have a table derived from two different tables using COUNT and GROUP BY, for which the output is like this:
Department  DeptID  Count(noofemployees)
HR          1       60
Accounting  19      7
Computers   4       67
Sys admin   6       5
Finance     3       15
Admin       9       12
Now I am trying to add and display
HR + Accounting + Finance = 10 + 7 + 13 = 30 as HR
Computers + Sys admin = 65 + 5 = 70 as Computers
Department  DeptID  Count(noofemployees)
HR          1       30
Computers   4       70
Admin       9       12
Can you please help me out with this?
I have found an answer to this.
I could not keep the Department name in the output, only the sums of Count(noofemployees), since we cannot GROUP BY here.
we can use
SELECT SUM(CASE WHEN Department IN ('HR','Accounting','Finance') THEN emp_count ELSE 0 END) AS HR,
       SUM(CASE WHEN Department IN ('Computers','Sys admin') THEN emp_count ELSE 0 END) AS Computers,
       SUM(CASE WHEN Department = 'Admin' THEN emp_count ELSE 0 END) AS Admin
FROM table_name;  -- emp_count: the Count(noofemployees) column of the derived table
this will give me the output:
HR  Computers  Admin
30  70         12
Not quite the output I am looking for, but it helps for now.
If anyone can help me get the output I am looking for, that would be great.
Thank you
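One way to get the row-oriented output originally asked for is to GROUP BY a mapped department name. A sketch: emp_count again stands in for the Count(noofemployees) column of the derived table (an assumed alias), and MIN(DeptID) simply keeps the smallest ID in each merged group:

SELECT CASE WHEN Department IN ('HR','Accounting','Finance') THEN 'HR'
            WHEN Department IN ('Computers','Sys admin')     THEN 'Computers'
            ELSE Department END AS Department,
       MIN(DeptID) AS DeptID,            -- smallest ID in each merged group
       SUM(emp_count) AS noofemployees   -- emp_count: the derived count column
FROM table_name
GROUP BY CASE WHEN Department IN ('HR','Accounting','Finance') THEN 'HR'
              WHEN Department IN ('Computers','Sys admin')     THEN 'Computers'
              ELSE Department END;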
