Data Manipulation: Dynamic observation Drop - r

I am working on credit line assimilation data.
df_01 <- data.frame(
user_id = c(1,1,1,1,1,1,1,1,1,2,2),
var_type = c("withdraw","repaid","repaid","withdraw","repaid","repaid","repaid","withdraw","repaid","withdraw","repaid"),
withdraw_id = c("u1_w1","u1_w1","u1_w1","u1_w2","u1_w2","u1_w2","u1_w2","u1_w3","u1_w3","u2_w1","u2_w1"),
repaid_id = c("","u1_w1_r1","u1_w1_r2","","u1_w2_r1","u1_w2_r2","u1_w2_r3","","u1_w3_r1","","u2_w2_r1"),
amt = c(-50,30,20,-60,20,30,10,-40,50,-30,10),
credit_limit = c(100,100,100,100,100,100,100,100,100,50,50),
running_balance = c(50,80,100,40,60,90,100,60,110,40,40),
new_credit_limit = c(50,50,50,50,50,50,50,50,50,40,40),
new_running_balance = c(0,30,50,-10,NA,NA,NA,10,60,0,40),
drop_obs_flag = c(0,0,0,1,1,1,1,0,0,0,0)
)
Here:
var_type has two values: withdraw and repaid.
Each repaid_id is associated with a withdraw_id.
We are trying to see how allotting a different credit line to users would affect our business.
Objective: under the new credit line, if a withdraw amount is greater than the previous new_running_balance for that user_id, we have to drop all rows associated with that withdraw_id, i.e. set drop_obs_flag = 1.
remarks_1: if an earlier withdraw observation is dropped, the next withdraw_id for that user_id has to be checked in turn (so this works as a loop).
remarks_2: the last two columns, new_running_balance and drop_obs_flag, are the output variables.
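The rule in the objective can be sketched as a single pass over one user's rows. This is only an illustration in Python, not R, and it rests on assumptions that are not stated in the question: the balance starts at new_credit_limit, rows are already in chronological order, and dropped rows get no balance at all (the sample shows -10 on the dropped withdraw row itself). The function name `simulate` is hypothetical.

```python
def simulate(rows, new_limit):
    """Walk one user's rows in order and return (new_running_balance, drop_obs_flag) per row.

    rows: list of (var_type, withdraw_id, amt) tuples; withdraw amounts are negative.
    Assumption: the running balance starts at new_limit.
    """
    balance = new_limit
    dropped = set()  # withdraw_ids that exceeded the available balance
    out = []
    for var_type, withdraw_id, amt in rows:
        # A withdraw larger than the current balance drops its whole group.
        if var_type == "withdraw" and -amt > balance:
            dropped.add(withdraw_id)
        if withdraw_id in dropped:
            out.append((None, 1))  # dropped rows leave the balance untouched
        else:
            balance += amt
            out.append((balance, 0))
    return out


# user_id 1 from df_01, with new_credit_limit = 50
user1 = [
    ("withdraw", "u1_w1", -50), ("repaid", "u1_w1", 30), ("repaid", "u1_w1", 20),
    ("withdraw", "u1_w2", -60), ("repaid", "u1_w2", 20), ("repaid", "u1_w2", 30),
    ("repaid", "u1_w2", 10),
    ("withdraw", "u1_w3", -40), ("repaid", "u1_w3", 50),
]
result = simulate(user1, 50)
```

Because a dropped withdraw never changes the balance, the next withdraw is automatically checked against the balance as it stood before the dropped one, which is the loop described in remarks_1.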


Why do I get an "ambiguous column name: account.accrued"?

I am trying to execute an SQLite query to update a column in my accounts table:
UPDATE account SET accrued = (account.accrued + ((product.intrate/365)*balance))
FROM account
JOIN customer ON customer.custid = account.custid
JOIN product ON product.prodid = account.prodid
WHERE active = 1
I have tried this, but it fails with the error:
ambiguous column name: account.accrued
UPDATE a SET accrued = (a.accrued + ((p.intrate/365)*balance))
FROM account a
JOIN customer c ON c.custid = a.custid
JOIN product p ON p.prodid = a.prodid
WHERE active = 1
I have also tried the query above, which uses aliases, but it fails with "no such table: a". If I take the column accrued out of the calculation, I then get the same error for the balance column.
The column accrued is only in the one table, account.
This is the correct syntax for SQLite (UPDATE ... FROM requires version 3.33.0 or later):
UPDATE account AS a
SET accrued = (a.accrued + ((p.intrate/365)*balance))
FROM customer c JOIN product p
ON p.prodid = a.prodid
WHERE c.custid = a.custid AND active = 1
Note that you should qualify all the column names (except the column that is updated after SET) with the table name/alias to avoid ambiguities.
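As a rough check of the UPDATE ... FROM form, here is a minimal sketch using Python's sqlite3 module against an in-memory database. The schema is an assumption made for illustration: the question does not say which table holds balance and active, so both are placed in account here, and the acctid column, the sample values and the function names are hypothetical. The join conditions are written in the WHERE clause, and the divisor is 365.0 so the division is not truncated to an integer.

```python
import sqlite3


def accrue_interest(conn):
    # The target table is aliased in UPDATE and is not repeated in FROM.
    conn.execute("""
        UPDATE account AS a
        SET accrued = a.accrued + (p.intrate / 365.0) * a.balance
        FROM customer AS c, product AS p
        WHERE c.custid = a.custid
          AND p.prodid = a.prodid
          AND a.active = 1
    """)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (custid INTEGER PRIMARY KEY);
        CREATE TABLE product (prodid INTEGER PRIMARY KEY, intrate REAL);
        CREATE TABLE account (
            acctid INTEGER PRIMARY KEY, custid INTEGER, prodid INTEGER,
            balance REAL, accrued REAL, active INTEGER
        );
        INSERT INTO customer VALUES (1);
        INSERT INTO product VALUES (1, 3.65);
        INSERT INTO account VALUES (1, 1, 1, 1000.0, 0.0, 1);  -- active
        INSERT INTO account VALUES (2, 1, 1, 500.0, 0.0, 0);   -- inactive, left alone
    """)
    accrue_interest(conn)
    rows = conn.execute("SELECT acctid, accrued FROM account ORDER BY acctid").fetchall()
    conn.close()
    return rows
```

On an SQLite build older than 3.33.0 the UPDATE statement raises a syntax error, so it is worth checking `sqlite3.sqlite_version` first.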

Peoplesoft Learning Management - First Time Pass Rate

We are working on PeopleSoft ELM 9.2 and I'm not finding what I need in any of the OOB queries so I'm attempting to build one. We need to get a "First Time Pass Rate" score, or the average of how many people pass a learning course on their first try. Ideally we would have the data for each attempt so that we could also see how many attempts a particular person or course has prior to getting a passing score.
Our setup includes SCORM 1.2 modules that pass off a score to the LMS for verification that a passing score was received. The closest I've been able to come so far is to get a "Pass/Fail" score but either my query unions are incorrect or the data is incorrect as it returns extraneous data that is not relevant to the other data returned or (if I force distinct values) it returns sporadically accurate data. Below is the SQL that I'm using if it helps.
Has anyone tried to build this kind of query before and how did you do it? :)
SELECT DISTINCT D1X.XLATLONGNAME, D.LM_ORGANIZATION_ID, D.LM_ORG_DESCR,
D.LM_HR_EMPLID, TO_CHAR(D.LM_HIRE_DT,'YYYY-MM-DD'), D.FIRST_NAME,
D.LAST_NAME, C.LM_CS_LONG_NM, C.LM_ACT_CD, A.LM_LC_LONG_NM, A.LM_LC_ID,
B12X.XLATLONGNAME, TO_CHAR(B.LM_COMPL_DT,'YYYY-MM-DD'), B.LM_ENRLMT_ID,
E15X.XLATLONGNAME
FROM PS_LM_LC A,
  PS_LM_ENRLMT B
    LEFT OUTER JOIN PSXLATITEM B12X
      ON B12X.FIELDNAME='LM_STTS' AND B12X.FIELDVALUE=B.LM_STTS
      AND B12X.EFF_STATUS = 'A'
      AND B12X.EFFDT = (SELECT MAX(EFFDT) FROM PSXLATITEM TB
        WHERE TB.FIELDNAME=B12X.FIELDNAME AND TB.FIELDVALUE=B12X.FIELDVALUE
        AND TB.EFF_STATUS = 'A'
        AND TB.EFFDT <= TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD') ),
  PS_LM_ACT_CI_VW C,
  PS_LM_PERS_DTL_VW D
    LEFT OUTER JOIN PSXLATITEM D1X
      ON D1X.FIELDNAME='LM_ACTIVE' AND D1X.FIELDVALUE=D.LM_ACTIVE
      AND D1X.EFF_STATUS = 'A'
      AND D1X.EFFDT = (SELECT MAX(EFFDT) FROM PSXLATITEM TB
        WHERE TB.FIELDNAME=D1X.FIELDNAME AND TB.FIELDVALUE=D1X.FIELDVALUE
        AND TB.EFF_STATUS = 'A'
        AND TB.EFFDT <= TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD') ),
  PS_LM_ENR_LC_BL_VW E
    LEFT OUTER JOIN PSXLATITEM E15X
      ON E15X.FIELDNAME='LM_PASS_STTS' AND E15X.FIELDVALUE=E.LM_PASS_STTS
      AND E15X.EFF_STATUS = 'A'
      AND E15X.EFFDT = (SELECT MAX(EFFDT) FROM PSXLATITEM TB
        WHERE TB.FIELDNAME=E15X.FIELDNAME AND TB.FIELDVALUE=E15X.FIELDVALUE
        AND TB.EFF_STATUS = 'A'
        AND TB.EFFDT <= TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD') )
WHERE ( A.LM_ACT_ID = B.LM_ACT_ID
AND A.LM_ACT_ID = C.LM_ACT_ID
AND D.LM_PERSON_ID = B.LM_PERSON_ID
AND D.BUSINESS_UNIT IN ('00340','00235')
AND B.LM_ENRL_DT BETWEEN TO_DATE(:1,'YYYY-MM-DD') AND TO_DATE(:2,'YYYY-MM-DD')
AND D.LM_HR_EMPLID = :3
AND D.LM_ACTIVE = :4
AND B.LM_STTS = 'COMP'
AND A.LM_LC_ID = E.LM_LC_ID
AND E.LM_PASS_STTS IN ('FAIL','PASS'))
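To pin down what "First Time Pass Rate" means numerically, here is a small sketch of the calculation on per-attempt data. It says nothing about where PeopleSoft ELM stores attempts: the record layout (person, course, attempt date, status) and the function name are hypothetical, and it assumes one row per attempt can be extracted at all. Only the 'PASS'/'FAIL' values come from the query above.

```python
from collections import defaultdict


def first_time_pass_rate(attempts):
    """Return {course: share of learners whose earliest attempt was a PASS}.

    attempts: iterable of (person, course, attempt_date, status) tuples,
    with attempt_date as a sortable 'YYYY-MM-DD' string.
    """
    first = {}  # (person, course) -> (date, status) of the earliest attempt
    for person, course, date, status in attempts:
        key = (person, course)
        if key not in first or date < first[key][0]:
            first[key] = (date, status)

    totals = defaultdict(int)
    passes = defaultdict(int)
    for (_, course), (_, status) in first.items():
        totals[course] += 1
        if status == "PASS":
            passes[course] += 1
    return {course: passes[course] / totals[course] for course in totals}


sample = [
    ("E1", "C1", "2017-01-01", "FAIL"),
    ("E1", "C1", "2017-01-08", "PASS"),  # second try, not counted
    ("E2", "C1", "2017-01-02", "PASS"),
]
rates = first_time_pass_rate(sample)
```

The same grouping by (person, course) also gives the number of attempts before a pass, if the rows are counted instead of reduced to the earliest one.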

Sqlite trigger to calculate an ID based on an entry type

I am trying to set up a trigger that will automatically calculate an ID field based on the count of a specific type of entry. I have it working where the number in the ID increments across all entries, regardless of type:
BEGIN
UPDATE master_workorders
SET wo_no = master_workorders.wo_sub || substr('0000'||master_workorders.pkuid, -4,4)||'-'|| substr(master_workorders.rdate,3,2)
WHERE rowid = NEW.rowid;
END
This returns IDs like WO0001-17, BB0002-17 and M0003-17, where the ID number (the middle 4 digits) counts every entry. I want the ID numbers to count the entries of each type (WO, BB, M; these values are stored in the wo_sub column), giving WO0001-17, BB0001-17 and M0001-17. If a new BB work order is added it would be BB0002-17, and so on for each type.
To replace the autoincremented ID with the current count, replace master_workorders.pkuid with a subquery:
... || (SELECT COUNT(*) FROM master_workorders WHERE wo_sub = NEW.wo_sub) || ...
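Put together, the trigger with the subquery might look like the sketch below, run through Python's sqlite3 module. The table definition, the trigger name set_wo_no, the AFTER INSERT timing and the 'YYYY-MM-DD' date format are assumptions for illustration, since the question only shows the trigger body.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE master_workorders (
            pkuid  INTEGER PRIMARY KEY,
            wo_sub TEXT,   -- entry type: WO, BB, M
            rdate  TEXT,   -- assumed 'YYYY-MM-DD'
            wo_no  TEXT
        );
        CREATE TRIGGER set_wo_no AFTER INSERT ON master_workorders
        BEGIN
            UPDATE master_workorders
            SET wo_no = wo_sub
                || substr('0000' || (SELECT COUNT(*) FROM master_workorders
                                     WHERE wo_sub = NEW.wo_sub), -4, 4)
                || '-' || substr(rdate, 3, 2)
            WHERE rowid = NEW.rowid;
        END;
    """)
    return conn


def add_workorder(conn, wo_sub, rdate):
    # Insert a row and return the wo_no the trigger generated for it.
    cur = conn.execute(
        "INSERT INTO master_workorders (wo_sub, rdate) VALUES (?, ?)",
        (wo_sub, rdate),
    )
    row = conn.execute(
        "SELECT wo_no FROM master_workorders WHERE pkuid = ?", (cur.lastrowid,)
    ).fetchone()
    return row[0]
```

Because the number is a count of existing rows, deleting a work order and then adding another of the same type can produce a number that was already used.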

Shiny: BigQuery Fails when user selects "All" value

I am trying to use a BigQuery query to populate plots in Shiny. The query includes input values from the UI using selectInput. If the user selects a value that exists in the DB, such as year 2014, the query works correctly. However, I would also like the user to be able to select "All", which should select all values, and I am not sure how to express that in the query using selectInput.
server.r
data1 <- eventReactive(input$do_sql, {
bqr_auth(token = NULL, new_user = FALSE, verbose = FALSE)
query = paste('select month, event, partner_name, sum(f0_) from [dataset.table] where year =',input$year1,' and partner_name = \"',input$partner_name,'\"
GROUP by 1,2,3
ORDER by 1 asc
LIMIT 10000', sep="")
bqr_query(projectId, datasetId, query, maxResults =2000)
})
ui.r
(
selectInput("year1",
"Year:",
c("All",2014,2015
))
),
(
selectInput("partner_name",
"Partner:",
c("All",
unique(as.character(data5$partner_name))))
You should slightly change the query you are constructing.
Currently you have:
SELECT month, event, partner_name, SUM(f0_)
FROM [dataset.table]
WHERE year = selected_year
AND partner_name = "selected_partner_name"
GROUP BY 1,2,3
ORDER BY 1 ASC
LIMIT 10000
where, respectively:
selected_year --> input$year1
selected_partner_name --> input$partner_name
Instead, you should construct the query below:
SELECT month, event, partner_name, SUM(f0_)
FROM [dataset.table]
WHERE (year = selected_year OR "selected_year" = "All")
AND (partner_name = "selected_partner_name" OR "selected_partner_name" = "All")
GROUP BY 1,2,3
ORDER BY 1 ASC
LIMIT 10000
I am not a Shiny user at all, so excuse my syntax; below is just my guess at implementing the suggestion above:
query = paste('SELECT month, event, partner_name, sum(f0_)
FROM [dataset.table]
WHERE (year =',input$year1,' OR "All" ="',input$year1,'")
AND (partner_name = \"',input$partner_name,'\" OR "All" = \"',input$partner_name,'\")
GROUP by 1,2,3
ORDER by 1 asc
LIMIT 10000', sep="")
Mikhail's solution worked perfectly for character variables, but numerics didn't work correctly. I decided to use a character date range instead of the year numeric I originally used. Thanks.
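The numeric case probably fails because selecting "All" makes the generated condition read year = All, which is not a valid comparison. A different approach from the one in the answer is to leave a filter out of the WHERE clause entirely when "All" is selected. The sketch below shows the idea in Python rather than R; the function name build_query is hypothetical, and the partner name is not escaped here, so real code would need to sanitize it or use query parameters.

```python
def build_query(year, partner_name):
    """Build the query text, adding a filter only for inputs other than "All"."""
    filters = []
    if year != "All":
        # int() keeps the year numeric and rejects unexpected input
        filters.append(f"year = {int(year)}")
    if partner_name != "All":
        # NOTE: not escaped; illustration only
        filters.append(f'partner_name = "{partner_name}"')

    lines = ["SELECT month, event, partner_name, SUM(f0_)", "FROM [dataset.table]"]
    if filters:
        lines.append("WHERE " + " AND ".join(filters))
    lines += ["GROUP BY 1, 2, 3", "ORDER BY 1 ASC", "LIMIT 10000"]
    return "\n".join(lines)
```

With both inputs set to "All" the query has no WHERE clause at all, so no placeholder comparison is needed for either the numeric or the character column.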

Schedule planning procedure

My family owns a medium-sized transport company, and when I came into the business 3 years ago we had no software to manage all the transports we had to do. With 20 drivers this was a problem, so I sat down, learned the basics of VBA and made an app through Excel to manage/dispatch the different trips by email to our different drivers. It "works" for now, but we are planning a future expansion, so I started learning Xojo (I'm on a Mac; it's the closest thing to VBA).
We receive an Excel file one day ahead telling us which trips we have to do (we transport people). Basically, it's a sheet with all the different customers. I import this sheet into a "week file" to use the data afterwards through different macros. There is a lot of irrelevant information in this sheet, but the columns we are interested in are Type, Number and Hour.
So basically, I have to take all my rows (100+), group them by type and number, then order them by hour.
Here's a quick example of what my sheet looks like when sorted (the different colours are different drivers):
I think my procedure to get this result is not really that good. I loop through all the rows in a data sheet (which is hidden) with an If statement checking whether it's a new type or trip number, save the time and row references (first row, last row) in an array, then loop through the array to export the ranges to the display sheet. Keep in mind that I wrote this 3 weeks after learning that VBA existed. It "works", but I'd like to have a better process.
I will be using SQLite to store all the information in the application I'm starting to write. I'd like suggestions on how I could sort all my data faster using SQL. I'm looking for a procedure; I can figure out a way to code it.
Here's a sample of the code I made.
'Finds the blocks: a new block starts whenever the type (column 2) or trip number (column 3) changes
For RowSearch = 2 To RowCount
    If Sheets(DataSheetName).Cells(RowSearch, 2).Value <> Sheets(DataSheetName).Cells(RowSearch - 1, 2).Value _
    Or Sheets(DataSheetName).Cells(RowSearch, 3).Value <> Sheets(DataSheetName).Cells(RowSearch - 1, 3).Value Then
        Blocks(TripCount, 1) = Position
        Blocks(TripCount, 2) = RowSearch - 1
        Blocks(TripCount, 3) = Format(Sheets(DataSheetName).Cells(Position, 4).Value, "hh:mm")
        TripCount = TripCount + 1
        Position = RowSearch
    End If
Next RowSearch
'Closes the last block
Blocks(TripCount, 1) = Position
Blocks(TripCount, 2) = RowSearch - 1
Blocks(TripCount, 3) = Format(Sheets(DataSheetName).Cells(Position, 4).Value, "hh:mm")
'Sorts the blocks by time, loops through each trip's row range to sort the trips by time and type, and writes the blocks
RowSelect = 1
For BlockSearch = 1 To TripCount
    TempHour = "99:99"
    For RowOrder = 1 To TripCount
        If Blocks(RowOrder, 3) <= TempHour Then
            TempHour = Blocks(RowOrder, 3)
            Trips(BlockSearch, 1) = Blocks(RowOrder, 1)
            Trips(BlockSearch, 2) = Blocks(RowOrder, 2)
            RowChange = RowOrder
        End If
    Next RowOrder
    RowRange = Trips(BlockSearch, 2) - Trips(BlockSearch, 1) + 1
    FieldValue = Sheets(DataSheetName).Range("A" & Trips(BlockSearch, 1) & ":" & "R" & Trips(BlockSearch, 2))
    Sheets(SheetName).Range("A" & RowSelect & ":" & "R" & RowSelect + RowRange - 1) = FieldValue
    'Named arguments need ":=" ("Shift = xlDown" would be evaluated as a comparison)
    Sheets(SheetName).Rows(RowSelect + RowRange).Insert Shift:=xlDown
    RowSelect = RowSelect + RowRange + 1
    'Marks the block as used so it is not picked again
    Blocks(RowChange, 3) = "99:99"
Next BlockSearch
In SQL, "grouping" is an operation that not only partitions the rows into groups, but also aggregates all a group's rows to create a single output row for each group.
In your example, the rows are simply sorted by type, number, and hour, which would require a query like this:
SELECT *
FROM MyTable
ORDER BY Type, Number, Hour
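A quick way to try the query is through Python's sqlite3 module with an in-memory table. The column types, the sample rows and the function name sorted_trips are assumptions; in particular Hour is stored as zero-padded 'HH:MM' text so that text ordering matches time ordering.

```python
import sqlite3


def sorted_trips(rows):
    """Load (Type, Number, Hour) rows into a temporary table and return them sorted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (Type TEXT, Number INTEGER, Hour TEXT)")
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT Type, Number, Hour FROM MyTable ORDER BY Type, Number, Hour"
    ).fetchall()
    conn.close()
    return result


trips = sorted_trips([
    ("B", 2, "09:30"),
    ("A", 1, "14:00"),
    ("A", 1, "08:15"),
    ("A", 3, "07:00"),
])
```

Rows with the same Type and Number come out next to each other, so the application only has to watch for a change in those two columns to know where one trip ends and the next begins.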
