Is this loop redundant? - sqlite

The three tables of interest are:
Event, containing various details of, e.g., the Berlin Marathon,
Result, containing various fields including the user's race time and an FK to an Event, and
Goal, with an FK to the Event the user would like to run, a field for the time they'd like to run it in, and eventually an FK to the Result (the race) at which the user achieved their goal.
Obviously, the Event of the Result where the user achieved their goal has to be the Event of the Goal. But not all Goals have been achieved -- some may never be.
Is this bad design? Can anybody suggest a better way of modelling this problem? I'm using SQLite in a Django project.

Your Event table is OK.
But your Goal table design mixes up the proposed event and the actually achieved event.
I think the Result table can be merged with the Goal table into a new Result table, since one user may want to run multiple events. Your new Result table would look like this:
UserID  EventID  TimeProposed  ActualTimeUsed  Achieved
1       1        1 hour        1.1 hour        No
1       2        1.5 hour      1.2 hour        Yes
So the loop you mentioned is removed, since each row has only one event. (UserID and EventID remain FKs to the other two tables.)
The Achieved column can be updated using a query that checks whether ActualTimeUsed <= TimeProposed.
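For illustration, here is a minimal SQLite sketch of the merged table (table and column names are assumptions, not the asker's actual schema; in a Django project this would be a model with two ForeignKeys):

CREATE TABLE result (
    user_id          INTEGER NOT NULL REFERENCES user(id),
    event_id         INTEGER NOT NULL REFERENCES event(id),
    time_proposed    INTEGER,            -- goal time, e.g. in seconds
    actual_time_used INTEGER,            -- NULL until the user has run the event
    achieved         INTEGER DEFAULT 0,  -- 0/1 flag; SQLite has no BOOLEAN type
    PRIMARY KEY (user_id, event_id)      -- one goal per user per event
);

-- Recompute the flag from the two time columns:
UPDATE result
SET achieved = (actual_time_used IS NOT NULL
                AND actual_time_used <= time_proposed);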

Related

How do I create a running count of outcomes sequentially by date and unique to a specific person/ID?

I have a list of unique customers who have made transactions over a year (Jan – Dec). They have bought products using 3 different methods (card, cash, check). My goal is to build a multi-classification model to predict the method of payment.
To do this I am engineering some Recency and Frequency features into my training data, but I am having trouble with the following frequency count, because the only way I know how to do it is in Excel using the COUNTIFS and SUMIFS functions, which are prohibitively slow. If someone can help and/or suggest another solution, it would be very much appreciated.
So I have a data set with 3 columns (Customer ID, Purchase Date, and Payment Type), sorted by Purchase Date then Customer ID. How do I then get a prior frequency count of payment type by date that excludes the current row's transaction and any future transactions whose Purchase Date is later? So basically I want a running count of each payment option, for a unique Customer ID, over the date range before the purchase date of that training row. In my head I see it as “crawling” backwards through the transactions and counting. A simplified screenshot of the data frame is below, with the 3 prior-count columns I am looking to generate programmatically.
[Screenshot: sample rows with Customer ID, Purchase Date, Payment Type and the three prior-count columns to be generated]
This gives you the answer as a list of CustomerID, PurchaseDate, PaymentMethod and the prior count:
SELECT CustomerID, PurchaseDate, PaymentMethod,
       (SELECT COUNT(T.CustomerID)
        FROM History AS T
        WHERE T.CustomerID = History.CustomerID
          AND T.PaymentMethod = History.PaymentMethod
          AND T.PurchaseDate < History.PurchaseDate) AS PriorCount
FROM History;
You can save this query and use it as the source for a crosstab query to get the columnar format you want
Some notes:
I assumed "History" as the source table name - you can change the query above to use the correct source
To use this as a query, open a new query in design view. Close the window that asks what tables the query is to be built on. Open the SQL view of the query design - like design view, but it shows the SQL instead of the normal design interface. Copy the above into the SQL view.
You should now be able to switch to datasheet view and see the results
When the query is working to your satisfaction, save it with any appropriate name
Open a new query in design view
When you get the list of tables to include, switch to the list of queries and include the query you just saved
Change the query type to crosstab and update the query as needed to select rows, columns and values - look up "access crosstab queries" if you need more help (see the sketch just below)
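For reference, the crosstab step might look roughly like this in Access SQL, assuming the query above was saved as "PriorCounts" (that name, and the choice of Sum as the aggregate, are illustrative):

TRANSFORM Sum(PriorCount)
SELECT CustomerID, PurchaseDate
FROM PriorCounts
GROUP BY CustomerID, PurchaseDate
PIVOT PaymentMethod;

Each payment method becomes its own column, with the prior count appearing under the method used on that transaction.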
Another tip to see what is happening here:
You can take the subquery - the part inside the () above - and make just that statement into its own query, excluding the opening and closing (). Then you can look at its design view to see what it does.
Save it with an appropriate name and put it into the query above in place of the statement in () - then you can look at the design view.
Sometimes it's easier to visualize and learn from 2 queries strung together this way than to work with subqueries.

Dynamodb data model for process/transaction monitoring

I want to keep track of a multi-stage processing job.
I likely just need the following fields:
batchId (guid) | eventId (guid) | statusId (int) | timestamp | message (string)
There are a relatively small number of events per batch.
I want to be able to easily query events that have a statusId less than n (still being processed or didn't finish processing).
Would using multiple rows for each status change, and querying for the latest status, be the best approach? I would use a global secondary index, but statusId does not seem like a good candidate for a hash key (fewer than 10 statuses).
Instead of using multiple rows for every status change, you could update the same event row and use a technique described in the DynamoDB documentation in the section 'Use a Calculated Value'. Basically this involves adding another attribute (say 'derivedStatusId') which is derived by appending a random number to statusId at the time of writing to DynamoDB. For example, for a statusId of 2, derivedStatusId could be one of {"2-00", "2-01", .. "2-99"}. Setting up a Global Secondary Index on derivedStatusId would give you some fan-out, which helps prevent the index from becoming hot.
If you are sure that you will use this index only for unfinished events, then removing the derivedStatusId attribute from the record when it transitions to a finished status will remove it from the index as well - which may be a good property if events are expected to finish processing eventually, even if the records themselves stay around forever. This technique is called a "Sparse Index" and is described in more detail here.
From your question, it seems like keeping a status history is a desired property (I assume this because you want to have multiple rows for status changes). Consider putting this historical information in the same row. DynamoDB supports list data types and also has a generous 400KB item limit, which may just allow you to capture all the desired historical information in the same record.

Crystal Report with Multiple Tables - Empty or Cartesian Product

I know this has been asked before... sort of. And that's why I'm posting. Basically I'm building a report in Crystal that relies on, to keep this simple, at least 3 tables.
Table A is inner joined to table B by a unique ID. Table B has a child table that may or may not have data related to this unique ID.
As a general example, table A is a customer table, table B is a product table, and the child table contains the product number. All customers have a product, but not all customers have a product number in the child table. I hope I've explained that simply enough.
My issue is sort of between Crystal and Access and how to query this. When I'm writing behind something in VB it's easy enough to write and execute a query and display the result in the desired manner. However I can't seem to get my query straight... I either end up with a report with a cartesian product as the result set, which displays OK... except that, even with the few records I have, it ends up being about 30k pages... or I end up with a blank dataset because the child table does not have corresponding data to B.
Using outer joins I've managed to get my results within some amount of reason, but not acceptable for a real-world report. I'm sure this issue has come up, but I can't seem to find any suitable answers, and to be honest I'm not even sure what questions to ask, being a Crystal n00b.
What I'm really after is the data from table A, the data from table B, and the child tables. While they are logically linked and can be linked with the ID field, I don't think it's necessary, because I am taking a parameter value for the report on the ID field. And once the tables are filtered, no other action needs to be taken except to dump them back on the report.
So can anybody point me in the right direction? Can I set up individual data sources (unrelated), based perhaps in a separate section? Should I build a tree of queries and logic in my DB to get what I need out? I've been racking my brain and can't seem to find the right solution; any and all advice is appreciated, and if I can clarify anything or answer any questions I will.
Thanks in advance.
As per requested below:
Section1
ID  fname  lname
01  john   smith

Section2
ID  notifiedDate  notifiedTime
01  10/10/2012    12:35PM

S2childAdmin
ID  noteName
01  jane doe
This data is logically related and can be related in the DB. However, it is not necessary as long as the ID parameter is passed to each table. Querying Section1 inner joined with Section2 works fine. But any other arrangement results in more rows than required, and I end up with a report duplicated many times over. What I really need is something like Section1 joined with Section2, with S2childAdmin as a freely available table. Otherwise it multiplies my data or results in a null recordset (because it can return 0 rows).
I think this should help point you in the right direction, though it has been 5 years or so since I did heavy Crystal Reports work.
One option might be to join everything using outer joins like you stated you were, then use a Crystal Reports 'group' on the Table A ID, with a group based upon the Table B ID inside of that. Then, in the actual 'Detail' area, you would put your table C details if there were any, and use the group header/footer for Table A and Table B to show data specific to those objects.
Another possible solution that may fall short of your requirements but might get you thinking in another way, is to create your main report and in it, display the fields from table A. Then below those fields include a sub-report and pass in the unique ID from Table A. You will then have a query inside of the subreport that finds all of the Table B records with that Table A.ID value and displays their details.
At this point you run into a weakness of Crystal Reports (at least as of the last version I used) in that you cannot have a subreport inside of a subreport.

asp.net multi-row update with gridview control without looping (vb.net)

I'm pretty new to ASP.NET. I'm building my first real app as a test, using SQL Server 2008 R2, VS Express 2012, IIS 7.x and ASP.NET 4.0.
We receive FedEx shipment info every night that gets inserted into an MSSQL DB using SSIS. We then import the invoice, do variance matching, and book costs to jobs based on the job no in the Ref1 field. All this works great. However, the shipping department are supposed to put ONLY the JobNo in the Ref1 field. Of course they don't, and there are a lot of temps and shipping stations, so we need to fix the data. They'll put the JobNo followed by junk, or junk and then the job no. When the costing people are looking at the invoice it's usually obvious what the job no is (e.g. "Samples for job 123" should be "123"). There can be many rows with the same Ref1 that need editing (e.g. 20 cartons with the same Ref1). I have an SP with 3 params (OldRef, NewRef, InvNo) that updates the ref no for all records on that invoice:
UPDATE InvoiceLines
SET Ref1 = @NewRef1
WHERE InvNo = @InvNo AND Ref1 = @OldRef1
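For context, the full procedure might look something like this (a minimal T-SQL sketch; the procedure name and parameter types are assumptions, not from the original post):

CREATE PROCEDURE dbo.FixInvoiceRef1
    @OldRef1 nvarchar(100),   -- types assumed for illustration
    @NewRef1 nvarchar(100),
    @InvNo   int
AS
BEGIN
    SET NOCOUNT ON;
    -- One set-based statement corrects every matching line on the invoice,
    -- which is why no row-by-row looping is needed in the page code.
    UPDATE InvoiceLines
    SET Ref1 = @NewRef1
    WHERE InvNo = @InvNo AND Ref1 = @OldRef1;
END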
I figured a GridView (with an SqlDataSource) would be a nice way to present the data. I only show rows where the Ref1 field is invalid; as the user corrects them, the number of records shrinks.
I want the user to select a row and edit the Ref1 value; I'd just get the (old) ref no of the selected row, its new value, and call my SP with those and the InvNo (from a DropDown that filters the invoice lines table).
Turns out to be way more difficult/inefficient than I thought.
All the examples I found to do this type of thing make the user click all the rows and then loop through every row to do an update. Talk about slow and painful. I want to execute a single SP and have all matching rows updated, and then refresh the list.
So what I'd like to figure out is how to get the OLD Ref1 (the value in Ref1 before the edit - like deleted in an SQL trigger), the NEW Ref1 (the edited value that the user typed in Ref1 - like inserted in an SQL trigger), execute my SP, and then refresh the table with the updated result set.
Am I better off with something other than a GridView, or just using something other than the built-in Edit command?
If I do figure out how to do the update, how do I refresh the GridView?
Can anyone point me in the right direction?
On another note, I'm on the fence about switching to C#. Most of the examples I'm finding are in C#. I learned C++ many years ago and read up on C# over the weekend. It doesn't seem too difficult. I did find a Microsoft white paper, and it pretty much said there's little difference between VB and C#, so there's no real reason to switch. My colleagues do not know C or C#, so I'm just a bit concerned that in the unlikely event they need to help out, they'll be stuck. Any thoughts on this?
Regards
Mark

Qt: QSqlTableModel + QTableView sync with PostgreSQL

I'm writing a database access app for storing some data and want to ask a few questions about the model/view architecture.
(Using: Qt 4.7.4, own build; PostgreSQL 9.0; Targets: WinXP, Win7 (32/64 bit))
Let me first explain what I am trying to achieve and where I am currently.
I have two pages (subclassed QWidgets inserted in a QStackedWidget) with a QTableView bound to a model. Each view is bound to a table in the PostgreSQL server. You can add/edit/delete/sort/filter items.
Each page can be seen by only one type of user; let's call the roles Role1 and Role2.
The edit strategy of everything connected to the models is OnManualSubmit.
(Transaction isolation level = Serializable.) When two users want to edit (for example) the same row, I want to do a "SELECT ... FOR UPDATE" query - to make sure that when someone edits something, he merges his changes with any newer ones (just like in SVN, for example). But I see only a submitAll() method on QSqlTableModel.
Maybe catching the signals beforeUpdate(), beforeDelete() and beforeInsert(), and manually performing "SELECT ... FOR UPDATE", is one option.
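The SQL side of that idea would look roughly like this (a minimal sketch; the table name, key column and merge step are placeholders, not from the original post):

BEGIN;
-- Lock the target row; a second editor's SELECT ... FOR UPDATE blocks here
-- until this transaction commits or rolls back.
SELECT * FROM table1 WHERE id = :id FOR UPDATE;
-- Compare the freshly read row with the revision the user started editing
-- from, merge the changes, then write and release the lock:
UPDATE table1 SET some_column = :merged_value WHERE id = :id;
COMMIT;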
The other way I think is to subclass QSqlTableModel. What is the clean and nice way to achieve this?
I want to periodically refresh the QTableView on each of the pages (at most one page is visible at a time; Role1 users have access only to Page1, and likewise Role2 => Page2).
The first thing that came to my mind is to use a QTimer and manually call select() on the QSqlTableModel, but... not sure if this is the cool way.
I also want to periodically check if the connection to the database is OK, but I think that a QTimer + QSqlDatabase::isOpen() will do.
Now, the 2 tables have the same primary keys and some columns are the same. I want a row change made by a Role1 user in Table1 to automatically change the corresponding columns of Table2, and vice versa. Should I create a trigger in Postgres?
BTW, the database is small - each of the two tables is around 3-4000 rows with ~10 columns (varchars mostly, 1 text and 2 date columns).
Thanks for reading and Happy New Year! :)
I think you should consider doing some of the following:
Instead of using QSqlTableModel as a model I'd implement my own model as a subclass of QAbstractTableModel. This will allow you a lot of control over what you can do in terms of data manipulation.
One thing that this will require is that, for certain fields in the table, you would need to implement a subclass of QAbstractItemDelegate to allow for modification of data in the table, as I am fairly sure you don't want to allow users to update every field - for example, the primary key will likely have to be left alone.
For question 2, I would suggest implementing a field called transaction_counter on every row, so you don't have to select every row in the table - just the updated ones. The transaction_counter is updated on every row update, and a new value is written on every row insert. One requirement is that the counter is unique across the table. For example, say the initial state of the table is: row1 has counter = 0 and row2 has counter = 0. If row1 is updated, its counter is set to 1. When row1 is then updated again, its counter is set to 2. When row2 is now updated, its counter is set to 3, etc. You can certainly do the data refreshes now using a QTimer, and this will be much more advantageous than, for example, re-checking all the data, since one user may be updating the same table as another user with the same role.
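A rough SQL sketch of the counter idea (table and column names are placeholders; a sequence is one way to get the required table-wide uniqueness):

-- One-time setup:
ALTER TABLE table1 ADD COLUMN transaction_counter bigint NOT NULL DEFAULT 0;
CREATE SEQUENCE table1_version_seq;

-- Stamp every write with the next counter value:
UPDATE table1
SET some_column = :new_value,
    transaction_counter = nextval('table1_version_seq')
WHERE id = :id;

-- The periodic refresh then fetches only rows changed since the last poll:
SELECT * FROM table1 WHERE transaction_counter > :last_seen_counter;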
For question 3, I don't see any reason why not to use custom models, especially since, if you decide to separate the data from the model, you can manipulate the data separately from its display - a sort of Data->Model->View->Controller implementation. Each one can be maintained separately as long as you have a feedback mechanism for your delegates.
For question 4, the answer is: sure, or you can implement the trigger logic in your application.
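If you go the Postgres route, a trigger along these lines would keep the shared columns in sync (a minimal sketch; "id" and "shared_col" are assumed names, and the syntax targets PostgreSQL 9.0):

CREATE OR REPLACE FUNCTION sync_to_table2() RETURNS trigger AS $$
BEGIN
    -- Only touch rows whose mirrored value actually differs; this also stops
    -- the reverse trigger on table2 from ping-ponging back to table1.
    UPDATE table2
    SET shared_col = NEW.shared_col
    WHERE id = NEW.id
      AND shared_col IS DISTINCT FROM NEW.shared_col;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER table1_sync
AFTER UPDATE ON table1
FOR EACH ROW EXECUTE PROCEDURE sync_to_table2();

A mirror-image trigger on table2 covers the "vice versa" direction.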
Hope this helps. Have a great New Year!
