SQLite selection help needed

I have the following bill table:
building  name    amount  payments  receiptno
1234      name a  123     0         0
1234      name a  12      10        39
1234      name a  125     125       40
1235      name a  133     10        41
1235      name b  125     125       50
1234      name c  100     90        0
I want to select the rows where amount minus payments is greater than zero and display the maximum value of receiptno.
So from building 1234 I want to select only the following:
name a 39
name c 0
How can I do this?

Translating your description into SQL results in this:
SELECT building,
       name,
       MAX(receiptno)
FROM BillTable
WHERE amount - payments > 0
GROUP BY building,
         name;
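If you only want the rows for building 1234, as in the example output, one option might be to also filter on the building in the WHERE clause (a sketch, keeping the assumed table name BillTable):
SELECT building,
       name,
       MAX(receiptno)
FROM BillTable
WHERE amount - payments > 0
  AND building = 1234
GROUP BY building,
         name;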

Related

How to sum column based on value in another column in two dataframes?

I am trying to create a limit order book and in one of the functions I want to return a list that sums the column 'size' for the ask dataframe and the bid dataframe in the limit order book.
The output should be...
$ask
  oid price size
8   a   105  100
7   o   104  292
6   r   102  194
5   k    99   71
4   q    98  166
3   m    98   88
2   j    97  132
1   n    96  375
$bid
  oid price size
1   b    95  100
2   l    95   29
3   p    94   87
4   s    91  102
Total volume: 318 1418
Where the input is...
oid,side,price,size
a,S,105,100
b,B,95,100
I have a function book.total_volumes <- function(book, path) { ... } that should return total volumes.
I tried to use aggregate but struggled with the fact that there are both an ask and a bid side in the limit order book.
I appreciate any help, I am clearly a complete beginner. Only here to learn :)
If there is anything more I can add to make this question clearer, feel free to leave a comment!
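A minimal sketch of the summing itself, assuming book is a list holding the $ask and $bid data frames shown above (the path argument from the question is left out here, since it isn't needed for the sum):
book.total_volumes <- function(book) {
  # sum the 'size' column of each side of the book
  c(bid = sum(book$bid$size), ask = sum(book$ask$size))
}
Called on the example book above, this would return bid = 318 and ask = 1418.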

R group data into equal groups with a metric variable

I'm struggling to get a well-performing script for this problem: I have a table with score, x, y. I want to sort the table by score and then build groups based on the x value. Each group should have an equal sum (not count) of x. x is a metric variable in the dataset and represents the historic turnover of a customer.
score x y
0.436024136 3 435
0.282303336 46 56
0.532358015 24 34
0.644236597 0 2
0.99623626 0 4
0.557673456 56 46
0.08898779 0 7
0.702941303 453 2
0.415717835 23 1
0.017497461 234 3
0.426239166 23 59
0.638896238 234 86
0.629610596 26 68
0.073107526 0 35
0.85741877 0 977
0.468612039 0 324
0.740704267 23 56
0.720147257 0 68
0.965212467 23 0
A good way to do this is to add a group variable to the data.frame with cumsum! Then you can easily sum the groups with e.g. subset.
data.frame$group <- cumsum(as.numeric(data.frame$x)) %/% (ceiling(sum(data.frame$x) / 3)) + 1
Remarks:
cumsum(as.numeric()) works reliably even in big data.frames
%/% is integer division, i.e. a division that returns an integer
the '+ 1' just lets your groups start at 1 instead of 0
Thank you @Ronak Shah!
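Putting the sort step and the grouping together, a sketch (using df as a placeholder name for the data frame, which avoids masking the built-in data.frame() function, and assuming three groups are wanted):
df <- df[order(df$score), ]  # sort by score first
df$group <- cumsum(as.numeric(df$x)) %/% ceiling(sum(df$x) / 3) + 1
# check that the group sums of x come out roughly equal
aggregate(x ~ group, data = df, FUN = sum)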

SQLite - performing calculation on preceding rows

The problem is this: if I have a quantity in any location, I want to apply the calculation below to every member of that job_no.
The idea is that if there's a quantity in loc3, the same quantity was previously in loc1 and loc2.
So another way to put it might be: how do I get 10 against loc1 and loc2?
select s.job_no, s.part, s.location, s.qty,
       coalesce(ptime.setup_time, '-') as setup_time,
       coalesce(ptime.cycle_time, '-') as cycle_time,
       ci.rate
from stock as s
join part_timings as pt on pt.part = s.part
join locations as l on s.location = l.location
left join part_timings as ptime on s.part = ptime.part
                               and ptime.location = s.location
join costs_internal as ci
group by s.part, s.location
order by s.part, l.stage
job_no  part  location  qty  setup_time  cycle_time  rate  total
123     p1    loc1      0    60          30          0.5   ?
123     p1    loc2      0    30          15          0.5   ?
123     p1    loc3      10   60          15          0.5   ?
123     p1    loc4      0    60          15          0.5   ?
123     p1    loc5      0    60          15          0.5   ?
123     p1    loc6      0    60          15          0.5   ?
123     p1    loc7      20   60          15          0.5   ?
calculation to get total:
coalesce(round((pt.cycle_time * s.qty * ci.rate) +
(pt.setup_time * ci.rate), 2), '-')
EDIT:
I've added loc4 to loc7.
loc3 would need to have the calculation applied to loc1 and loc2 (qty 10).
loc7 would need to have the calculation applied to all locations that are before it (qty 20).
Maybe I'm not explaining it perfectly; I sometimes struggle to get my intentions across with SQL!
Using a simplified version of your data...
select * from stock;
job_no qty location
---------- ---------- ----------
123 0 loc1
123 0 loc2
123 10 loc3
123 0 loc4
456 0 loc1
456 20 loc2
You can use a sub-select to get the quantity for each job and join with it to get the stock for each job.
select stock.*, stocked.qty
from stock
join (select * from stock s where s.qty != 0) as stocked
on stock.job_no = stocked.job_no;
job_no qty location qty
---------- ---------- ---------- ----------
123 0 loc1 10
123 0 loc2 10
123 0 loc4 10
123 10 loc3 10
456 0 loc1 20
456 20 loc2 20
stocked has the row for each job which is currently stocked.
Note that unless you've made a restriction, there may be more than one stocked row for a job.
loc7 would need to have the calculation applied to all locations that are before it (qty 20).
With this data...
sqlite> select * from stock order by job_no, location;
job_no qty location
---------- ---------- ----------
123 0 loc1
123 0 loc2
123 10 loc3
123 0 loc4
123 0 loc5
123 0 loc6
123 20 loc7
456 0 loc1
456 20 loc2
To accomplish this, instead of joining on the sub-select, do it as a column sub-select on a per-row basis; otherwise we'd get multiple rows wherever a job has more than one stocked location. (There's probably also a way to do it with a join.)
To make sure a row only picks up the quantity from a stocked location at or after its own (i.e. the stocked quantity applies to the locations before it, and to itself), it's necessary to check that stock.location <= stocked.location. To ensure we get the closest one, order by location and take only the first row.
select stock.*, (
         select stocked.qty
         from stock stocked
         where stock.job_no = stocked.job_no
           and stocked.qty != 0
           and stock.location <= stocked.location
         order by stocked.location asc
         limit 1
       ) as stocked_qty
from stock
order by job_no, location;
job_no qty location stocked_qty
---------- ---------- ---------- -----------
123 0 loc1 10
123 0 loc2 10
123 10 loc3 10
123 0 loc4 20
123 0 loc5 20
123 0 loc6 20
123 20 loc7 20
456 0 loc1 20
456 20 loc2 20
This may be inefficient as a column subselect. It's important that job_no, qty, and location are all indexed.
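Folding stocked_qty back into the total calculation from the question might then look something like this (a sketch, untested, keeping the question's table and column names, including its join on costs_internal without a join condition):
select s.job_no, s.part, s.location, s.qty,
       coalesce(round((pt.cycle_time * (
                         select stocked.qty
                         from stock stocked
                         where stocked.job_no = s.job_no
                           and stocked.qty != 0
                           and s.location <= stocked.location
                         order by stocked.location asc
                         limit 1
                       ) * ci.rate) +
                      (pt.setup_time * ci.rate), 2), '-') as total
from stock as s
join part_timings as pt on pt.part = s.part
join costs_internal as ci
order by s.job_no, s.location;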

adding and subtracting values in multiple data frames of different lengths - flow analysis

Thank you jakub and Hack-R!
Yes, these are my actual data. The data I am starting from are the following:
[A] #first, longer dataset
CODE_t2 VALUE_t2
111 3641
112 1691
121 1271
122 185
123 522
124 0
131 0
132 0
133 0
141 626
142 170
211 0
212 0
213 0
221 0
222 0
223 0
231 95
241 0
242 0
243 0
244 0
311 129
312 1214
313 0
321 0
322 0
323 565
324 0
331 0
332 0
333 0
334 0
335 0
411 0
412 0
421 0
422 0
423 0
511 6
512 0
521 0
522 0
523 87
In the above table, we can see the 44 land use CODES (which I inappropriately named "class" in my first entry) for a certain city. Some values are just 0, meaning that there are no land uses of that type in that city.
Starting from this table, which displays all the land use types for t2 and their corresponding values ("VALUE_t2"), I have to reconstruct the previous amount of land use ("VALUE_t1") for each type.
To do so, I have to add and subtract the value for each land use (if not 0) using the "change land use table" from t2 to t1, which is the following:
[B] #second, shorter dataset
CODE_t2 CODE_t1 VALUE_CHANGE1
121 112 2
121 133 12
121 323 0
121 511 3
121 523 2
123 523 4
133 123 3
133 523 4
141 231 12
141 511 37
So, in order to get VALUE_t1 from VALUE_t2, I have, for instance, to subtract 2 + 12 + 0 + 3 + 2 hectares (first 5 values of the second, shorter table) from the value of land use type/code 121 of the first, longer table (1271 ha), and add 2 hectares to land type 112, 12 hectares to land type 133, 3 hectares to land type 511 and 2 hectares to land type 523. And I have to do that for all the land use types different than 0, and later also from t1 to t0.
What I have to do is a sort of loop that would both add and subtract, per each land use type/code, the values from VALUE_t2 to VALUE_t1, and from VALUE_t1 to VALUE_t0.
Once I estimated VALUE_t1 and VALUE_t0, I will put the values in a simple table showing the relative variation (here the values are not real):
CODE VALUE_t0 VALUE_t2 % VAR t2-t0
code1 50 100 ((100-50)/50)*100
code2 70 80 ((80-70)/70)*100
code3 45 34 ((34-45)/45)*100
What I could do so far is:
land_code <- names(A)[-1]
land_code
A$VALUE_t1 <- for(code in land_code{
cbind(A[1], A[land_code] - B[match(A$CODE_t2, B$CODE_t2), land_code])
}
If I use the loop I get an error, while if I take it away:
A$VALUE_t1 <- cbind(A[1], A[land_code] - B[match(A$CODE_t2, B$CODE_t2), land_code])
it works, but it doesn't give me what I want... So far I have been trying to get a new column containing the new "add & subtract" values, but I haven't succeeded yet. So I worked on getting a new column that at least matches the land use types first, so that I can then include the "add and subtract" formula.
Another problem is that, by using "match", I get a shorter A$VALUE_t1 table (13 rows instead of 44), while I would like to keep all the land use types in dataset A, because I will then have to match it with the table containing VALUES_t0 (which I haven't shown here).
Sorry that I cannot do better than this at the moment... I hope to have explained more clearly what I have to do. I am extremely grateful for any help you can provide.
Thanks a lot
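One way to do the add-and-subtract while keeping all 44 rows of A might be to aggregate the change table by source and destination code and then match the totals back onto A. A sketch, untested, assuming the column names shown above and following the rule described (the change is subtracted from the CODE_t2 value and added to the CODE_t1 value):
# total change leaving each t2 code and arriving at each t1 code
minus <- aggregate(VALUE_CHANGE1 ~ CODE_t2, data = B, FUN = sum)
plus  <- aggregate(VALUE_CHANGE1 ~ CODE_t1, data = B, FUN = sum)
# match the totals back onto A; codes that never appear in B change by 0
sub <- minus$VALUE_CHANGE1[match(A$CODE_t2, minus$CODE_t2)]
add <- plus$VALUE_CHANGE1[match(A$CODE_t2, plus$CODE_t1)]
sub[is.na(sub)] <- 0
add[is.na(add)] <- 0
A$VALUE_t1 <- A$VALUE_t2 - sub + add
This keeps A at its full 44 rows; for example, code 121 becomes 1271 - (2 + 12 + 0 + 3 + 2) = 1252 and code 112 becomes 1691 + 2 = 1693. The same pattern could then be repeated with the t1-to-t0 change table to get VALUE_t0.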

SQLite substituting values where JOIN doesn't find a match

I'm new to SQLite and I'm trying to create a view of joined tables, where table_results references table_names. The user can add, edit and remove items from table_names, but they cannot change the references within table_results. What I'm trying to accomplish: if an entry in table_names is removed, a JOIN table_names ON (table_results.name_id = table_names._id) should still return all rows in table_results, but display "NO NAME" wherever the name entry has been removed.
Example:
table_names:
_id name
1 John
2 Bill
3 Sally
4 Nancy
table_results:
_id name_id score_1 score_2
1 1 50 75
2 4 80 60
3 2 83 88
4 3 75 75
5 2 93 95
where:
SELECT table_results._id, table_names.name, table_results.score_1, table_results.score_2
FROM table_results
JOIN table_names ON (table_results.name_id = table_names._id);
produces:
1 John 50 75
2 Nancy 80 60
3 Bill 83 88
4 Sally 75 75
5 Bill 93 95
Now if the user was to remove Bill from table_names, the same query string would produce:
1 John 50 75
2 Nancy 80 60
4 Sally 75 75
My Question:
Is there a way to have the query substitute values when the join doesn't find a match? I'd like the above example to produce the following output, but I'm not sure how to write the query string in SQL.
1 John 50 75
2 Nancy 80 60
3 NO NAME 83 88
4 Sally 75 75
5 NO NAME 93 95
Thanks for your help.
Sure. You are doing an inner join; just change it to a left outer join:
SELECT table_results._id
      ,COALESCE(table_names.name, 'NO NAME') AS [name]
      ,table_results.score_1
      ,table_results.score_2
FROM table_results
LEFT OUTER JOIN table_names
  ON (table_results.name_id = table_names._id);
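Since the question mentions creating a view, the same query could be wrapped in CREATE VIEW (a sketch; the view name results_with_names is just a placeholder):
CREATE VIEW results_with_names AS
SELECT table_results._id
      ,COALESCE(table_names.name, 'NO NAME') AS [name]
      ,table_results.score_1
      ,table_results.score_2
FROM table_results
LEFT OUTER JOIN table_names
  ON (table_results.name_id = table_names._id);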
