How can I add specific relationship or query types as an attribute on a Bookshelf Model? - bookshelf.js

I've been using KnexJS for a while, and want to transition to BookshelfJS, as my model classes on the server side are starting to get a little hairy, and why re-invent the wheel.
For a lot of my API server, what I want to do is pre-fetch a list of related models (a document has many and belongs to many editors) without necessarily pre-fetching the whole thing. Ideally, I'd end up with
document = {
  id: 1
  body: 'foobar'
  editor_ids: [1, 2]
}
Now, I can do this by doing editors: belongsToMany(Profiles) on the Document definition, and then doing a fetch().withRelated(['editors']), but the problem there is that it returns the full Profile object on the fetch.
This generates an extraneous join (documents_editors join editors on editors.id = documents_editors.editor_id) that's not needed, and to conform to the spec my client app expects, (the IDs embedded and then the profiles themselves added later in the JSON response only optionally, and actually in practice never, because profiles tend to get cached and loaded elsewhere), I have to manually shove the editor_ids attribute in there by parsing through Document.relations, which also adds (a tiny tiny bit) of extra time.
So, ultimately, I can do what I want but it's not elegant. Ideally, there's something in BookshelfJS where I could do something like
Document = bookshelf.Model.extend
  tableName: 'documents'
  fancyValue: ->
    @rawQuery 'select editor_id from documents_editors where document_id = ?', [@id]
Or build a knex-style query in there. I know in the above particular use case, a raw query is kind of overkill, but I do actually have some more annoying queries to run as well. I track user-community memberships, and permission grants on documents to communities, which means I use a postgres-style CTE to do something like
with usergroups as
(
  select communities.id from communities inner join edges
    on communities.id = edges.parent_id and edges.parent_type = 'communities'
    and edges.child_id = ? and edges.child_type = 'profiles'
    and edges.type = 'grant: comment'
)
select distinct documents.id as parent_id, 'documents' as parent_type
from documents inner join edges
  on edges.parent_id = documents.id and edges.parent_type = 'documents'
  and edges.type = 'grant: edit'
  and documents.type = 'collection'
where (edges.child_type = 'profiles' and edges.child_id = ?) or
  (edges.child_type = 'communities' and edges.child_id in (select id from usergroups))
(which finds all the documents of type 'collection' that the user in question can edit, either because they were directly added as an editor or because they belong to a community which was granted edit access).

A description of how I approached this problem can be found here:
https://gist.github.com/ericeslinger/a83e74501e9901c8b795
which basically amounts to "I wrote some custom behaviors and threw them into a subclass of bookshelf.Model that all of my application Model objects inherit from".
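The gist remains the actual description of that approach. As a rough, language-neutral illustration of the same idea (a shared base class that attaches query-backed attributes to every model), here is a minimal Python sketch that uses the standard sqlite3 module in place of Bookshelf and knex. The names BaseModel, computed, and to_json are hypothetical and are not part of the Bookshelf API.

```python
import sqlite3


class BaseModel:
    """Hypothetical base class: subclasses list extra attributes backed by SQL."""

    table = None
    # attribute name -> SQL returning one column, parameterised on the row id
    computed = {}

    def __init__(self, db, row_id):
        self.db = db
        self.id = row_id

    def to_json(self):
        # Load the model's own columns first.
        cur = self.db.execute(f"SELECT * FROM {self.table} WHERE id = ?", (self.id,))
        row = cur.fetchone()
        data = dict(zip([col[0] for col in cur.description], row))
        # Then attach each computed attribute as a flat list of values.
        for name, sql in self.computed.items():
            data[name] = [r[0] for r in self.db.execute(sql, (self.id,))]
        return data


class Document(BaseModel):
    table = "documents"
    computed = {
        # Reads only the join table, so the profiles table is never touched.
        "editor_ids": "SELECT editor_id FROM documents_editors "
                      "WHERE document_id = ? ORDER BY editor_id",
    }


db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE documents_editors (document_id INTEGER, editor_id INTEGER);
    INSERT INTO documents VALUES (1, 'foobar');
    INSERT INTO documents_editors VALUES (1, 1), (1, 2);
""")
doc = Document(db, 1).to_json()
print(doc)
```

The point of the design is that each model declares its extra attributes in one place, and the serialization step in the base class does the work for all of them.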

Related

Can SQLite return default values for non-existent columns instead of error?

I know how to use IFNULL to get default values for non-existent rows or null values, but for creating queries that are compatible with older schema versions, it would be nice to be able to do this:
Schema v1: CREATE TABLE Employee (Name TEXT, Phone TEXT)
Schema v2: CREATE TABLE Employee (Name TEXT, Phone TEXT, Address TEXT)
Theoretical backward compatible query:
SELECT Name, Phone, IFNULL(Address, '') FROM Employee
Obviously this doesn't work for a file created with schema v1. Is there some way to do this though?
There are 2 alternative workflows, but both are rather annoying. Either 1) update the old db by adding missing columns (which would start with null values); or 2) build the query code dynamically based on schema version.
Create a temporary view that references a particular schema, substituting default values (or even transforming other data) for individual columns which differ between the base schemas.
SQLite views can even be made modifiable by defining appropriate triggers.
This still requires programming some conditional logic upon connection, but it would allow more uniform queries and interaction with different versions of the schema.
The suggested syntax would perhaps be convenient in some limited cases, but this approach is much more useful since it can be expanded beyond simple "if column exists" Boolean operations and instead could be used to perform dynamic transformation of one schema into another, perhaps joining tables and providing more advanced logic for updates of differing schema, etc.
Pseudo code mixed with view definitions to demonstrate:
db <- Open database connection
db_schema <- determine schema version
If db_schema == 1 Then
    db.execute( "CREATE VIEW temp.EmployeeX AS
                 SELECT Name, Phone, '' AS Address
                 FROM main.Employee;" )
Else If db_schema == 2 Then
    db.execute( "CREATE VIEW temp.EmployeeX AS
                 SELECT Name, Phone, Address
                 FROM main.Employee;" )
End If
#Later in code
data <- db.getdata("SELECT Name, Address
                    FROM EmployeeX")
If you're really averse to conditional statements for the schema this may still be annoying, but it would at least reduce/eliminate conditional statements throughout the code--ideally occurring as part of the connection logic at one location in the code.
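A runnable version of the temporary-view idea might look like the following Python sketch, using the standard sqlite3 module. It makes one assumption that differs from the pseudo code: instead of reading a stored schema version, it detects whether the Address column exists via PRAGMA table_info. The helper names open_with_compat_view and make_db are hypothetical.

```python
import sqlite3


def open_with_compat_view(db):
    """Create temp.EmployeeX so later queries can ignore the schema version."""
    # Column 1 of each PRAGMA table_info row is the column name.
    columns = {row[1] for row in db.execute("PRAGMA main.table_info(Employee)")}
    # Assumption: a missing Address column means the file uses schema v1.
    address = "Address" if "Address" in columns else "'' AS Address"
    db.execute(
        f"CREATE TEMP VIEW EmployeeX AS SELECT Name, Phone, {address} FROM main.Employee"
    )
    return db


def make_db(schema_sql, insert_sql):
    """Build an in-memory database standing in for a file with a given schema."""
    db = sqlite3.connect(":memory:")
    db.execute(schema_sql)
    db.execute(insert_sql)
    return open_with_compat_view(db)


v1 = make_db("CREATE TABLE Employee (Name TEXT, Phone TEXT)",
             "INSERT INTO Employee VALUES ('Ann', '555-0100')")
v2 = make_db("CREATE TABLE Employee (Name TEXT, Phone TEXT, Address TEXT)",
             "INSERT INTO Employee VALUES ('Bob', '555-0101', '1 Main St')")

# The same query now works against both schema versions.
query = "SELECT Name, Address FROM EmployeeX"
rows_v1 = v1.execute(query).fetchall()
rows_v2 = v2.execute(query).fetchall()
print(rows_v1, rows_v2)
```

The only schema-dependent branch lives in open_with_compat_view, which matches the goal of keeping conditional logic in the connection code.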
You might further notice that this pattern is really what object-oriented programming is supposed to solve. There's no mention of the language in the question, but a well-designed object model could be created in a similar fashion so that all database access is done through a unified interface. The implementation details for different schemas are internal to different objects that derive (i.e. implement interfaces and/or inherit from base class) from a basic set of interfaces. Consider the language you're using to see if the problem could be solved this way.

How to retrieve an entity using a property from datastore

Is it possible to retrieve an entity from gae datastore using a property and not using the key?
I could see I can retrieve entities with key using the below syntax.
quote = mgr.getObjectById(Students.class, id);
Is there an alternative that enables us to use a property instead of key?
Or please suggest any other ways to achieve the requirement.
Thanks,
Karthick.
Of course this is possible. Think of the key of an entity being like the primary key of an SQL row (but please, don't stretch the analogy too far - the point is it's a primary key - the implementations of these two data storage systems are very different and it causes people trouble when they don't keep this in mind).
You should read up on either JDO queries or JPA queries, depending on what kind of mgr your post refers to. For JDO, you would do something like this:
// begin building a new query on the Cat-kind entities (given a properly annotated
// entity model class "Cat" somewhere in your code)
Query q = pm.newQuery(Cat.class);
// set filter on species property to == param
q.setFilter("species == speciesParam");
// set ordering for query results by age property descending
q.setOrdering("age desc");
// declare the parameters for this query (format is "<Type> <name>")
// as referenced above in filter statement
q.declareParameters("String speciesParam");
// run the query
List<Cat> results = (List<Cat>) q.execute("siamese");
For JPA, you would use JPQL strings to run your queries.

Optimizing doctrine performance: select * is a bad idea?

We are trying to optimize a project that is consuming a lot of memory. All of our queries are built using this kind of syntax:
$qb->select(array('e'))
->from('MyBundle:Event', 'e');
This is converted in a query selecting every field of the table, like this:
SELECT t0.id AS id1,
       t0.field1 AS field12,
       t0.field2 AS field23,
       t0.field3 AS field34,
       t0.field4 AS field45
FROM event t0
Is it a good idea for performance to use Partial Object Syntax to hydrate only some predefined fields? I really don't know whether it will affect performance, and I will have a lot of disadvantages because the other fields will be null. What do you usually do in your select queries with Doctrine?
Regards.
My two cents
I suppose that hydration (object hydration and lazy loading, of course) is fine as long as you don't know how many fields, or which ones, you need to pull from the DB tables into your objects. If you know that you have to retrieve all the fields, it is better to fetch them once and work with them, instead of running a time-consuming query every time.
However, as a good practice, once I have retrieved and used my objects I unset them explicitly (unless they are used in the last instructions of a function, which will return and implicitly unset them).
Update
$entity_manager = $this->getDoctrine()->getManager();
$my_obj_repo = $entity_manager->getRepository('MyBundleName:my_obj');
$my_obj = $my_obj_repo->fooHydrateFunction(12); // here I don't pull all the data out of the db
// do stuff with this object, like extracting or manipulating data
if ($my_obj->getBarField() == null) // this is the only field loaded by fooHydrateFunction, so no lazy loading
{
    $my_obj->setBarField($request->query->get('bar'));
    $entity_manager->persist($my_obj);
    $entity_manager->flush();
}
// here my object isn't necessary anymore
unset($my_obj); // this is a PHP instruction
// continue with my business logic

Entity Framework 5 .Include() is not loading records if there are related entities that have been deleted

I have a database with a 1..many relationship between two tables, call them Color and Car. A Color is associated 1..many with Cars. In my case, it's critical that Colors can be deleted any time. No cascade delete, so if a Color is deleted, the Car's Color_ID field points to something that doesn't exist. This is OK. They are related via a FK named Color_ID.
The problem comes in when I do this:
var query = context.Cars.Include(x => x.Colors);
This only returns Cars that have an associated Color record that exists. What I really want is ALL the Cars, even if their color doesn't exist, so I can do model binding with a GridView, i.e.
<asp:Label runat="server" Text='<%# Item.Colors == null ? "Color Deleted!" : Item.Colors %>' />
All of this works fine if I remove the .Include() and resort to lazy loading. Then Item.Car.Color is null. Perfect. However I'm seriously concerned about doing way too many database queries for a massive result set, which is certainly possible.
One solution to avoid excessive db queries is to return an anonymous type from the datasource query with all the specific related bits of info that I need for the grid and convert all my "Item" style bindings to good 'ol Eval(). But then I lose the strong typing, and the simplicity that Value Provider attributes bring. I'd hate to re-write all that.
Am I right, do I have to choose one or the other? How can I shape my query to return all the Car records, even if there is no Color record? I think I'm screwed if I try to eager load with .Include(). I need like a .IncludeWithNulls() or something.
UPDATE: Just thought of this. I don't know how ugly this is as far as query cost, but it works. Is there a better way??
var query = context.Cars.Include(x => x.Colors);
var query2 = context.Cars.Where(x => !context.Colors.Any(y => y.Color_ID == x.Color_ID));
return query.Union(query2);
The problem was an incorrect end multiplicity. What I really needed was not 1..many but 0..many. That way, Entity Framework generates a left outer join instead of an inner join from the .Include(). Which makes sense, there may be zero actual Color records in the example above. The thing that confused me was that in the SQL database, I never set those foreign key fields to nullable because at the time of creation, they always required a valid foreign key. So I set them to nullable and fixed up my .edmx table and everything is working. I did have to add a few more null checks here and there such as the one in my question above, that weren't strictly necessary before, since the .Include is now pulling in records that reference missing related entities, but no big deal.
So I lose out on the non-null checking at the db level, but I gain some consistent logic in my LINQ queries for how those tables actually relate and what I expect to get back.
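The difference between the two join types can be reproduced outside Entity Framework. The Python sketch below uses the standard sqlite3 module and hand-written SQL; it only illustrates the join semantics and is not the SQL that Entity Framework actually emits. The table layout and every column name other than Color_ID are assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE Color (Color_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Car (Car_ID INTEGER PRIMARY KEY, Model TEXT, Color_ID INTEGER);
    INSERT INTO Color VALUES (1, 'Red');
    -- The Coupe points at a Color that has been deleted.
    INSERT INTO Car VALUES (1, 'Sedan', 1), (2, 'Coupe', 99);
""")

# Inner join (1..many): cars without a matching Color row disappear.
inner = db.execute("""
    SELECT Car.Model, Color.Name
    FROM Car INNER JOIN Color ON Color.Color_ID = Car.Color_ID
    ORDER BY Car.Car_ID
""").fetchall()

# Left outer join (0..many): every car is returned, with NULL for a missing Color.
left = db.execute("""
    SELECT Car.Model, Color.Name
    FROM Car LEFT OUTER JOIN Color ON Color.Color_ID = Car.Color_ID
    ORDER BY Car.Car_ID
""").fetchall()

print(inner)
print(left)
```

The NULL in the left-join result is what surfaces as a null navigation property, which is why the extra null checks become necessary.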

Database schema advice for storing form fields and field values

I've been tasked with creating an application that allows users the ability to enter data into a web form that will be saved and then eventually used to populate pdf form fields.
I'm having trouble trying to think of a good way to store the field values in a database as the forms will be dynamic (based on pdf fields).
In the app itself I will pass data around in a hash table (fieldname, fieldvalue) but I don't know the best way to convert the hash to db values.
I'm using MS SQL server 2000 and asp.net webforms. Has anyone worked on something similar?
Have you considered using a document database here? This is just the sort of problem they solve a lot better than traditional RDBMS solutions. Personally, I'm a big fan of RavenDb. Another pretty decent option is CouchDb. I'd avoid MongoDb as it really isn't a safe place for data in its current implementation.
Even if you can't use a document database, you can make SQL pretend to be one by setting up your tables to have some metadata in traditional columns with a payload field that is serialized XML or json. This will let you search on metadata while staying out of EAV-land. EAV-land is a horrible place to be.
UPDATE
I'm not sure if a good guide exists, but the concept is pretty simple. The basic idea is to break out the parts you want to query on into "normal" columns in a table -- this lets you query in standard manners. When you find the record(s) you want, you can then grab the CLOB and deserialize it as appropriate. In your case you would have a table that looked something like:
SurveyAnswers
    Id           INT IDENTITY
    FormId       INT
    SubmittedBy  VARCHAR(255)
    SubmittedAt  DATETIME
    FormData     TEXT
A few protips:
a) use a text based serialization routine. Gives you a fighting chance to fix data errors and really helps debugging.
b) For SQL 2000, you might want to consider breaking the CLOB (TEXT field holding your payload data) into a separate table. It's been a long time since I used SQL 2000, but my recollection is that using TEXT columns did bad things to tables.
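To make the "metadata columns plus serialized payload" idea concrete, here is a minimal Python sketch. It uses the standard sqlite3 module as a stand-in for SQL Server 2000 and JSON as the text-based serialization, so the column types are SQLite's rather than the ones listed above. The helper names save_form and load_forms and the sample data are hypothetical.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE SurveyAnswers (
        Id INTEGER PRIMARY KEY AUTOINCREMENT,
        FormId INTEGER,
        SubmittedBy TEXT,
        SubmittedAt TEXT,
        FormData TEXT
    )
""")


def save_form(form_id, user, submitted_at, fields):
    """Store queryable metadata in columns and the field hash as a JSON payload."""
    db.execute(
        "INSERT INTO SurveyAnswers (FormId, SubmittedBy, SubmittedAt, FormData) "
        "VALUES (?, ?, ?, ?)",
        (form_id, user, submitted_at, json.dumps(fields, sort_keys=True)),
    )


def load_forms(form_id):
    """Query on a metadata column, then deserialize each payload."""
    rows = db.execute(
        "SELECT SubmittedBy, FormData FROM SurveyAnswers WHERE FormId = ? ORDER BY Id",
        (form_id,),
    )
    return [(user, json.loads(payload)) for user, payload in rows]


save_form(7, "alice", "2010-01-01T12:00:00",
          {"first_name": "Alice", "email": "alice@example.com"})
loaded = load_forms(7)
print(loaded)
```

Because the payload is plain text, a broken record can be inspected and repaired by hand, which is the benefit tip (a) describes.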
The solution for what you're describing is called Entity Attribute Value (EAV) and this model can be a royal pain to deal with. So you should limit as much as possible your usage of this.
For example, if there are fields that are almost always in the forms (First Name, Last Name, Email, etc.), you should put them in a table as columns.
The reason is that if you don't, sooner or later somebody is going to realize that they have these names and emails and ask you to build this query
SELECT
    fname.Value fname,
    lname.Value lname,
    email.Value email,
    ....
FROM
    form f
    INNER JOIN formFields fname
        ON f.FormId = fname.FormID
        AND fname.AttributeName = 'fname'
    INNER JOIN formFields lname
        ON f.FormId = lname.FormID
        AND lname.AttributeName = 'lname'
    INNER JOIN formFields email
        ON f.FormId = email.FormID
        AND email.AttributeName = 'email'
    ....
when you could have written this
SELECT
    c.fname,
    c.lname,
    c.email,
    ....
FROM
    form f
    INNER JOIN common c
        ON f.FormId = c.FormId
Also, get off of SQL 2000 as soon as you can, because you're going to really miss the UNPIVOT clause.
It's also probably not a bad idea to look at previous SO EAV questions to get an idea of the problems that people have encountered in the past.
I'd suggest mirroring the same structure:
Form
-----
form_id
User
created

FormField
-------
formField_id
form_id
name
value
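As a rough sketch of how the (fieldname, fieldvalue) hash could map onto that two-table structure, the Python example below uses the standard sqlite3 module in place of SQL Server. The helper names save_form and load_form, the column types, and the sample data are assumptions made for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE Form (form_id INTEGER PRIMARY KEY, user TEXT, created TEXT);
    CREATE TABLE FormField (
        formField_id INTEGER PRIMARY KEY,
        form_id INTEGER REFERENCES Form(form_id),
        name TEXT,
        value TEXT
    );
""")


def save_form(user, created, fields):
    """Insert one Form row, then one FormField row per hash entry."""
    cur = db.execute("INSERT INTO Form (user, created) VALUES (?, ?)", (user, created))
    form_id = cur.lastrowid
    db.executemany(
        "INSERT INTO FormField (form_id, name, value) VALUES (?, ?, ?)",
        [(form_id, name, value) for name, value in fields.items()],
    )
    return form_id


def load_form(form_id):
    """Rebuild the (fieldname -> fieldvalue) hash from the FormField rows."""
    rows = db.execute(
        "SELECT name, value FROM FormField WHERE form_id = ?", (form_id,)
    ).fetchall()
    return dict(rows)


fields = {"FirstName": "Alice", "LastName": "Smith"}
form_id = save_form("alice", "2010-01-01", fields)
restored = load_form(form_id)
print(restored)
```

Every value is stored as text here, so any typing of individual fields would have to be handled by the application.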
