Remove Selected Items from Search Results - apache-flex

Use Case:
End-User searches for something and an ArrayCollection is returned with Result objects. This is displayed in a data grid.
End-User selects a few of the search results and "moves" them over to another datagrid for use later.
End-User does another search.
PROBLEM:
Some of the search results might contain something the user has already selected and moved over to the second datagrid. I want to remove these items from the new search results.
How can I do this quickly, and efficiently in Flex code?

disableAutoUpdate() on both array collections,
loop through the first one and, for each item of the second, remove it if it's present in the first one (or adapt the algorithm to what you really want - the exact semantics aren't clear from the question),
enableAutoUpdate() at the end.
Looping through an ArrayCollection can be quick as long as no events are dispatched.
As a second option, you could loop through a cheap copy backed by a plain array, obtained with arrayCollection.source.concat(), or even a Vector if all your items are of the same type. That gives the maximum speed, but you might lose in the long run, as you need to convert back to an ArrayCollection at the end.
So I would stick with the first option.
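A minimal sketch of that dedup pass, shown here in Java rather than ActionScript (names like newResults and selectedItems are illustrative): build a set from one collection for O(1) lookups, then strip matches from the other in a single pass.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ResultDeduper {
    /**
     * Removes from newResults every item already present in selectedItems.
     * Relies on the items implementing equals()/hashCode(); in Flex you would
     * wrap the call in disableAutoUpdate()/enableAutoUpdate() as described above.
     */
    public static <T> void removeAlreadySelected(List<T> newResults, List<T> selectedItems) {
        Set<T> selected = new HashSet<>(selectedItems); // O(1) membership tests
        newResults.removeIf(selected::contains);        // single pass, no per-item events
    }

    public static void main(String[] args) {
        List<String> newResults = new ArrayList<>(List.of("a", "b", "c", "d"));
        removeAlreadySelected(newResults, List.of("b", "d"));
        System.out.println(newResults); // prints [a, c]
    }
}
```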

For the time being, I've implemented a hash collection (it extends ArrayCollection). A hash only allows unique values, so in the end it serves my purpose, even though the UI might be confusing to the user. I'll probably implement the method above at a later date. :)
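For reference, the "hash collection" workaround boils down to a list that silently rejects duplicates. A rough Java equivalent (class name invented; a production version would also guard the other mutating methods):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

/** A list-like collection that ignores attempts to add an item it already holds. */
public class UniqueList<T> extends ArrayList<T> {
    private final Set<T> seen = new HashSet<>();

    @Override
    public boolean add(T item) {
        // Short-circuits: duplicates never reach the underlying list.
        // Note: add(int, T), addAll() and remove() would need the same bookkeeping.
        return seen.add(item) && super.add(item);
    }
}
```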

Related

JDO/DataNucleus: map/calculate derived field

Question in short: How can you map/calculate a derived field using JDO/DataNucleus?
Example
An Order can have one or more Items. The field totalItemAmount is the sum of the amounts of all its Items. totalItemAmount should not exist as a field in the datastore, but should be calculated.
With Hibernate one could use @Formula to annotate totalItemAmount; see https://stackoverflow.com/a/2986354/2294031.
Is there an equivalent for JDO/DataNucleus?
Workarounds
Because I have not found anything yet, I considered using alternative approaches. But I am not sure which one would be appropriate.
Implementing totalItemAmount as a method: the total amount of items could be calculated by a method (e.g. Order.getTotalItemAmount()) that iterates over all Items of the Order and sums up the amount of each Item (see the sketch after this list). But I imagine this approach would be very slow if I want to display an overview of many orders, because each time getTotalItemAmount() gets called, all Items of the Order will be (unnecessarily) fetched.
Defining a custom query: Is it possible to define a custom query, which will be used, when DataNucleus obtains Orders from the datastore?
Treating totalItemAmount as a "normal" column (like number): totalItemAmount will be an integer column, and every time the list of Items of the Order gets updated, totalItemAmount will be updated as well. But I do not like this approach, because it could lead to inconsistency: if the order gets modified outside the application context (e.g. using plain SQL), the content of totalItemAmount could be wrong.
Using a SQL view: I could define a view as described in Hibernate Derived Properties - Performance and Portability. But this would introduce a considerable amount of work and future maintenance - imho too much for the gain.
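For illustration, a minimal sketch of the first workaround, assuming the javax.jdo annotation API and hypothetical Order/Item classes; the derived value is computed on demand and never mapped to a column:

```java
import java.util.ArrayList;
import java.util.List;
import javax.jdo.annotations.PersistenceCapable;

@PersistenceCapable
public class Order {
    private List<Item> items = new ArrayList<>();

    /**
     * Derived value, never stored in the datastore. Calling this forces a
     * fetch of all Items of the Order - cheap on a detail page, potentially
     * expensive when rendering an overview of many Orders (the concern above).
     */
    public int getTotalItemAmount() {
        return items.stream().mapToInt(Item::getAmount).sum();
    }
}

@PersistenceCapable
class Item {
    private int amount;

    public int getAmount() {
        return amount;
    }
}
```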
Is there another way to solve this problem?
Off-Topic: Feel free to comment on my writing, as I really would like to improve it.

Selecting an item from a very large list

Suppose I have a list of a couple of thousand organizations and a user needs to be able to select one of them. The list is too large to populate in a dropdown at page load, and the user often knows what they want but it's not the first part of the organization name. That is, they know "Collections" but not that the precise name of the organization is "Department of Collections". So the user will need/want to type in some information.
It's easy enough to use an autocompleting textbox of some kind, but I don't want to allow the user to type in random text - they have to choose one of the organizations explicitly.
What's the best solution?
IMO, I would simplify the UI to:
a textbox to enter the search string
a drop-down to set the filter options, e.g. "contains | starts with | ends with"
a "Find" button
Then populate a view based on the search string and let the user either choose a valid item or refine the search (a sketch follows below).
IMO, with something like an auto-complete you will end up writing a lot of parsing code to get at the string, and then there may be server-side load considerations...
HTH.
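A sketch of the filtering behind such a "Find" button, written in Java for neutrality (names invented; the real data would come from a database query):

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class OrganizationSearch {
    enum Mode { CONTAINS, STARTS_WITH, ENDS_WITH }

    /** Case-insensitive filter over the full organization list, per the selected mode. */
    static List<String> find(List<String> organizations, String query, Mode mode) {
        String q = query.toLowerCase(Locale.ROOT);
        return organizations.stream()
                .filter(name -> {
                    String n = name.toLowerCase(Locale.ROOT);
                    return switch (mode) {
                        case CONTAINS -> n.contains(q);
                        case STARTS_WITH -> n.startsWith(q);
                        case ENDS_WITH -> n.endsWith(q);
                    };
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> orgs = List.of("Department of Collections", "Department of Roads");
        // The user only knows "Collections", not the start of the name - CONTAINS still finds it.
        System.out.println(find(orgs, "collections", Mode.CONTAINS));
    }
}
```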
In addition, check whether "faceted navigation" is something you need. Ref.: http://www.alistapart.com/articles/design-patterns-faceted-navigation/
So it seems to me your main challenges are to
Express that the user needs to select an organization from the list (and only from the list).
Express that there are a lot of organizations on the list.
Provide some means for the user to quickly find the organization on the list.
I would say present a selector control that fits in with the rest of your design, with a search box just above it. You should page the list; with that many elements there will be lots of pages, which signals that the user should definitely use the search. The search essentially acts like the auto-complete, but instead of the found options changing the text, it changes the contents of the paginated list. If you do this on a character-by-character basis (or throttle the input using Reactive Extensions), it's very clear that you're just filtering the list to make selection easier.
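The throttling part might look roughly like this in Java (a hand-rolled stand-in for Reactive Extensions' Throttle; the delay and the search callback are placeholders):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Runs the search only after the user has paused typing for the given delay. */
public class Debouncer {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final long delayMillis;
    private final Consumer<String> search;
    private ScheduledFuture<?> pending;

    public Debouncer(long delayMillis, Consumer<String> search) {
        this.delayMillis = delayMillis;
        this.search = search;
    }

    /** Call on every keystroke; each call cancels the previously scheduled search. */
    public synchronized void onKeystroke(String currentText) {
        if (pending != null) {
            pending.cancel(false);
        }
        pending = scheduler.schedule(() -> search.accept(currentText),
                delayMillis, TimeUnit.MILLISECONDS);
    }
}
```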
You could use a CustomValidator to ensure that the TextBox's content is contained in your collection.
You could use the Ajax AutoComplete Control: http://www.asp.net/ajax/ajaxcontroltoolkit/Samples/AutoComplete/AutoComplete.aspx. You can opt to only do a lookup if the user has typed in a certain number of characters.
You'd create a static Web Method to query the collection (you could use LINQ) and return matching organizations.
You'd obviously need to validate the textbox input afterwards.
Is it possible to structure your list a bit more like a tree, so that it is not one single list? E.g. you could have a grouping like "Government Depts" and then add Dept of Collections to it. Then ask your users to first select the top-level grouping, then show them a shorter list of organizations in that group.
It sounds to me as if your data list should really be in either a database or at least stored well away from the UI.
Wherever it's really stored, attach a keyword to each entry, say "Collection". The list of keywords could be made available as part of your auto-complete functionality. Then search on the keyword alone.
If you could divide items in categories, would using some kind of tree control help?
So, when the user clicks on a node, you load only the items in that node, and so on.
I'd break it into two paths...
Use an autocompleting textbox for the person who types the correct title (i.e., Department of Collections), and a separate search button to search for possible matches. The search button would take you to a results page where you select the desired choice. This functionality would be similar to the way search on MSDN works.
Initially a tree view sounds cool, but are you certain that a single classification will reduce the data into manageable sets? If 80% of your data gets classified as "government dept" this doesn't really help things.
The problem is you want criteria that allows users to quickly split a large list into smaller sets that are easier to consume. Additionally, there should be enough flexibility to react to changes in data.
I'd suggest using a tagging pattern like iTunes. In my library "rock" describes 80% of my collection - but is still a useful categorization for something like random shuffle. I also have the ability to stack tags so I can use genre="rock", decade="1990" and quickly sift my data down to whatever is of interest.
In the UI, I'd recommend a section that allows the user to apply "filters", which is nothing more than selecting specific values for tags. Break the list out into pages and allow them to see a tally of potential matches (a sketch follows the scenario below).
Scenario:
- Navigate to screeen XYZ and see there are 10,000 companies to pick from
- Click "classification" and select "Government dept" and the list updates to indicate there are now 1,000.
- Click "region" and select "South" and see my list drop to 200.
- Sort list by name and then select (or scroll through, whatever)
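A sketch of stacking those tag filters (hypothetical Company type; each selected filter narrows the remaining list, mirroring the counts in the scenario):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TagFilterDemo {
    record Company(String name, Map<String, String> tags) {}

    /** Keeps only companies whose tag matches the value; calls can be chained. */
    static List<Company> applyFilter(List<Company> companies, String tag, String value) {
        return companies.stream()
                .filter(c -> value.equals(c.tags().get(tag)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Company> all = List.of(
                new Company("Dept of Collections",
                        Map.of("classification", "Government dept", "region", "South")),
                new Company("Acme Corp",
                        Map.of("classification", "Private", "region", "North")));

        List<Company> narrowed = applyFilter(all, "classification", "Government dept");
        narrowed = applyFilter(narrowed, "region", "South");
        System.out.println(narrowed.size() + " match(es)"); // 1 match(es)
    }
}
```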

Caching ListView data a viable option?

Here's my scenario:
1) User runs search to retrieve values for display in ListView via LinqDataSource.
2) They click on one of the items which takes them to another page where the details can be examined, further drill-down can happen, etc.
3) User wants to go back to the original ListView results to select another item for inspection.
I can see it's possible to pass the querystring params around, allowing the querying to be duplicated each time the user comes back to the ListView, but it seems like there ought to be a way to cache the results.
Since I'm using the LinqDataSource, though, I believe the actual results are fetched each time the query is run. I'm currently feeding a "select new {blah, blah}" type of IEnumerable to the e.Results, which can't be turned into a List since it's populated with anonymous types.
In short:
1) Does it make sense to try to place potentially large query results in the users session?
2) If it does, is a List the reasonable data structure?
3) Do I need to resort to something like creating a class with the correct properties to hold the anonymous data, enumerate the query return, populate the List?
4) Is there a better option than the LinqDataSource for this type goal?
5) Or, does it just make more sense to run the query each time they hit the ListView?
I apologize if this wasn't clear. I would really appreciate it if someone can set me straight before I nuke a bunch of my free time headed down the wrong path :)
First, I would suggest that you look into the caching mechanism that comes with ASP.NET, unless the data is private for a certain user.
Second, I would suggest that you design your application in a way so that you create natural points where you could try to get data from a cache before querying the database (and insert data into the cache, with expiration rules), but don't start putting stuff into the cache until you have verified that it will actually make a difference.
Measure how much time that is actually spent on retrieving data and use caching in the cases where it makes a difference.
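A bare-bones cache-aside sketch of those "natural points" (in Java rather than ASP.NET; the key, TTL, and query supplier are all placeholders):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class QueryCache<T> {
    private record Entry<V>(V value, long expiresAtMillis) {}

    private final Map<String, Entry<T>> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public QueryCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Try the cache first; on a miss or an expired entry, run the query and cache the result. */
    public T getOrLoad(String key, Supplier<T> query) {
        Entry<T> entry = cache.get(key);
        if (entry != null && entry.expiresAtMillis() > System.currentTimeMillis()) {
            return entry.value(); // cache hit - no database round trip
        }
        T value = query.get(); // e.g. re-running the search on a miss
        cache.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
        return value;
    }
}
```

Usage would be something like cache.getOrLoad(queryString, () -> runSearch(queryString)), keyed by the same query-string parameters being passed around.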
I'm not sure if resurrecting threads from the dead is cool on SO, but here is what I found to answer this question:
http://weblogs.asp.net/pwelter34/archive/2007/08.aspx

Allowing nulls vs default values

I'm working on an ASP.NET project that replaces many existing paper forms. One of the requirements is that the user can save the form in any state, i.e. they could create a new blank form and immediately save it with no data or with partial data. I'm validating for data type on every save but validation for required fields does not occur until the user marks the form as completed.
I'm not sure what the best approach is to handle this requirement in the database and domain model. As I see it, I have two options:
Allow nulls for any field that may not have data. This feels like the "correct" approach but it requires that almost every database field allow nulls and I have to code around a lot of nullable types. Also, when the form is finalized none of the required fields are enforced in the database.
Populate my business objects with meaningful default values. In some cases, there are meaningful default values for many (but not all) fields that I could use. This approach verges on "magic numbers" which makes me uncomfortable.
Which approach is best? Or is there a third way? I'm not willing to go to extremes, such as splitting the tables.
Edited to add: I wanted to expand on this a bit since I accepted a response. The primary reason that I'm not interested in splitting the tables is that once a project is submitted, the data on the forms is used to generate data for another system that is the system of record. At that point the original form data is unlikely to be revised or used for reporting.
I don't understand why you don't want to split the tables. I don't know what domain you're in but in any I could imagine there are two classes of people:
people who have submitted the form
people who haven't
And as a business executive I don't care about the second. But the first I care deeply about, and they need to have all their data in correctly.
It also improves efficiency - most of your queries about aggregate data will be over the first table, not the second. The second table will only be used for index seeks.
If splitting the table(s) (are there more than one?) is not an option, I would consider creating a single table to store serialisations of the incomplete form objects, and only committing a form to the "real" tables when it is fully submitted by the user.
If there isn't a sensible default, and you don't want to split the data, then nulls are almost certainly your best option. Regarding the DB not being able to verify that the fields are not null when completed: well, if you don't want to split the table there isn't much you can do (short of using a CHECK constraint, or an INSTEAD OF trigger to run validation). But the DB isn't the only place responsible for data validation; your app logic can do that too.
You could use a temporary table with "allow nulls" on every column to store the form containing partial or no data and copy / move the data to the final table when the user marks the form as completed. This way, you do not depend on default values (which the user may forget to change), you can save in any state, and you still have the validation in the end.
This is a situation that cries out for split tables. I know you said you don't want to do that, and in a comment even said "this project doesn't warrant that level of effort", but it's really the best solution.
Set up preliminary table(s) in which everything except your key is nullable. When the user marks the form complete, and it passes validation, move it to the final table(s). Not only is this The Right Thing To Do, it's probably less effort than "coding around nullable values" when working with finished forms.
If you need to see all forms, finished or not, make a Union view.
I'd take the first option but add a column to the database tables so that when the form is completed this is flagged. Then for anything using the form data it merely needs to check that the form has been completed.
That's my suggestion for a way around this.
NULL values are not stored in ordinary single-column B-tree indexes (in Oracle, at least), so they are not searchable through an index.
If you need to issue a query like "select the first 10 forms with a certain field unfilled", that query will use a FULL TABLE SCAN, which may not be efficient.
Oracle does not distinguish between NULL and the empty string, but other databases do. You'll probably want to make the empty string the DEFAULT for unfilled fields and use it in searches.
If you don't need to search on unfilled fields, then just make them NULL.
NULL generally means "Don't Know" (in a database) whereas an empty string could actually represent an empty string.
I would tend to use NULL as the "Don't Know" value in your case. When you print out data you'll just have to assume that any NULL value means an empty string.
CHECK CONSTRAINT + VIEW
If you don't have a status field, add one so you can tell when a form is finished.
Add a check constraint on that status field so a form can't be marked finished if any of the required columns are null.
When you write your queries on "finished" forms, you can skip the null checks everywhere if you do one of these two things:
just add status = 'F' (finished) to the WHERE clause, or
make a view of only the finished ones.
When using the "finished" view you don't have to do all the validation checks or worry about unfinished forms showing up in the results.
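A sketch of that constraint-plus-view combination, expressed as JDBC-executed DDL (table and column names invented; assumes an in-memory H2 database on the classpath, but the SQL is close to portable):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class FinishedFormSchema {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:forms");
             Statement stmt = conn.createStatement()) {

            stmt.execute("CREATE TABLE forms (" +
                    "id INT PRIMARY KEY, " +
                    "status CHAR(1) DEFAULT 'U' NOT NULL, " + // 'U'nfinished / 'F'inished
                    "applicant_name VARCHAR(100), " +         // nullable while unfinished
                    "amount INT)");

            // A form can only be flagged finished once every required column is filled in.
            stmt.execute("ALTER TABLE forms ADD CONSTRAINT chk_finished " +
                    "CHECK (status <> 'F' OR (applicant_name IS NOT NULL AND amount IS NOT NULL))");

            // Queries against this view never see unfinished forms and need no null checks.
            stmt.execute("CREATE VIEW finished_forms AS SELECT * FROM forms WHERE status = 'F'");
        }
    }
}
```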
I've had a similar situation, and while I haven't yet come up with a solution, I have been toying with the idea of using simple XML serialization to store the temporary document data. If you generate simple classes that model the data (using nullable types where needed, perhaps), it would be easy to stuff data from the screen into those objects, serialize them to XML, and store them in a temporary "staging" table. When your users are done working and want to submit or finalize the document, you perform all of your needed validation against the serialized data, eventually putting it into the "real" tables with the proper data structures and constraints.
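Roughly what that staging step could look like with the JDK's built-in XMLEncoder (the FormDraft bean is invented; JAXB or another serializer would work equally well):

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class DraftSerializer {
    /** A bean mirroring the form; fields simply stay null while the form is a draft. */
    public static class FormDraft {
        private String applicantName; // may be null until the user fills it in
        private Integer amount;       // nullable wrapper type on purpose

        public String getApplicantName() { return applicantName; }
        public void setApplicantName(String v) { applicantName = v; }
        public Integer getAmount() { return amount; }
        public void setAmount(Integer v) { amount = v; }
    }

    /** Serialize the draft to XML, ready to store in a single staging-table column. */
    public static String toXml(FormDraft draft) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder encoder = new XMLEncoder(out)) {
            encoder.writeObject(draft);
        }
        return out.toString(StandardCharsets.UTF_8);
    }

    /** Deserialize a stored draft so it can be validated and moved to the real tables. */
    public static FormDraft fromXml(String xml) {
        try (XMLDecoder decoder = new XMLDecoder(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))) {
            return (FormDraft) decoder.readObject();
        }
    }
}
```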

Bulk Collection Manipulation through a REST (RESTful) API

I'd like some advice on designing a REST API which will allow clients to add/remove large numbers of objects to a collection efficiently.
Via the API, clients need to be able to add items to the collection and remove items from it, as well as manipulating existing items. In many cases the client will want to make bulk updates to the collection, e.g. adding 1000 items and deleting 500 different items. It feels like the client should be able to do this in a single transaction with the server, rather than requiring 1000 separate POST requests and 500 DELETEs.
Does anyone have any info on the best practices or conventions for achieving this?
My current thinking is that one should be able to PUT an object representing the change to the collection URI, but this seems at odds with the HTTP 1.1 RFC, which seems to suggest that the data sent in a PUT request should be interpreted independently from the data already present at the URI. This implies that the client would have to send a complete description of the new state of the collection in one go, which may well be very much larger than the change, or even be more than the client would know when they make the request.
Obviously, I'd be happy to deviate from the RFC if necessary but would prefer to do this in a conventional way if such a convention exists.
You might want to think of the change task as a resource in itself. So you're really PUT-ing a single object, which is a Bulk Data Update object. Maybe it's got a name, owner, and big blob of CSV, XML, etc. that needs to be parsed and executed. In the case of CSV you might want to also identify what type of objects are represented in the CSV data.
List jobs, add a job, view the status of a job, update a job (probably in order to start/stop it), delete a job (stopping it if it's running) etc. Those operations map easily onto a REST API design.
Once you have this in place, you can easily add different data types that your bulk data updater can handle, maybe even mixed together in the same task. There's no need to have this same API duplicated all over your app for each type of thing you want to import, in other words.
This also lends itself very easily to a background-task implementation. In that case you probably want to add fields to the individual task objects that allow the API client to specify how they want to be notified (a URL they want you to GET when it's done, or send them an e-mail, etc.).
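A skeletal JAX-RS rendering of such a job resource (paths and fields invented; persistence, validation, and the background worker are omitted):

```java
import java.net.URI;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import javax.ws.rs.*;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/bulk-update-jobs")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
public class BulkUpdateJobResource {

    public static class Job {
        public String id;
        public String status = "queued"; // queued -> running -> done/failed
        public String dataType;          // e.g. "csv"
        public String payload;           // the blob of CSV/XML to parse and execute
    }

    private static final Map<String, Job> jobs = new ConcurrentHashMap<>();

    /** POST the whole bulk change as one resource; the response points at the new job. */
    @POST
    public Response create(Job job) {
        job.id = UUID.randomUUID().toString();
        jobs.put(job.id, job);
        // A background worker would pick the job up and apply the changes.
        return Response.created(URI.create("/bulk-update-jobs/" + job.id)).entity(job).build();
    }

    /** Poll the job to see whether the bulk change has been applied yet. */
    @GET
    @Path("/{id}")
    public Job status(@PathParam("id") String id) {
        return jobs.get(id);
    }

    /** Delete the job, stopping it if it is running. */
    @DELETE
    @Path("/{id}")
    public void cancel(@PathParam("id") String id) {
        jobs.remove(id); // a real implementation would also signal a running worker
    }
}
```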
Yes, PUT creates/overwrites, but does not partially update.
If you need partial update semantics, use PATCH. See http://greenbytes.de/tech/webdav/draft-dusseault-http-patch-14.html.
You should use AtomPub. It is specifically designed for managing collections via HTTP. There might even be an implementation for your language of choice.
For the POSTs, at least, it seems like you should be able to POST to a list URL and have the body of the request contain a list of new resources instead of a single new resource.
As far as I understand it, REST means REpresentational State Transfer, so you should transfer the state from client to server.
If that means too much data going back and forth, perhaps you need to change your representation. A collectionChange structure would work, with a series of deletions (by id) and additions (with embedded full xml Representations), POSTed to a handling interface URL. The interface implementation can choose its own method for deletions and additions server-side.
The purest version would probably be to define the items by URL, and the collection contain a series of URLs. The new collection can be PUT after changes by the client, followed by a series of PUTs of the items being added, and perhaps a series of deletions if you want to actually remove the items from the server rather than just remove them from that list.
You could introduce a meta-representation of existing collection elements that don't need their entire state transferred, so in some abstract code your update could look like this:
{existing elements 1-100}
{new element foo with values "bar", "baz"}
{existing element 105}
{new element foobar with values "bar", "foo"}
{existing elements 110-200}
Adding (and modifying) elements is done by defining their values, deleting elements is done by not mentioning them in the new collection, and reordering elements is done by specifying the new order (if order is stored at all).
This way you can easily represent the entire new collection without having to re-transmit the entire content. Using an If-Unmodified-Since header makes sure that your idea of the content indeed matches the server's (so that you don't accidentally remove elements that you simply didn't know about when the request was submitted).
One way to do it:
Pass only an array of the IDs of the objects to delete from the front-end application to the Web API.
Then you have two options:
Web API way: look up all the entities by their IDs and delete them in the API, but you then need to take care of dependent entities, such as rows related through foreign keys, yourself.
Database way: pass the IDs to the database side, find all records in the foreign-key tables and the primary-key tables, and delete them in that order, i.e. foreign-key table records first, then primary-key table records.
