I am studying https://www.doctrine-project.org/projects/doctrine-orm/en/2.6/reference/working-with-associations.html but I cannot figure out what cascade merge does. I have seen elsewhere that
$new_object = $em->merge($object);
basically creates a new managed object based on $object. Is that correct?
$em->merge() is used to take an entity that has been removed from the entity manager's context and 'reattach' it.
If the entity was never managed, merge is equivalent to persist.
If the entity was detached, or serialized (put in a cache, perhaps), then merge more or less looks up the entity's id in the data store and starts tracking any changes to it from that point on.
Cascading a merge extends this behavior to the associated entities of the one you are merging. This means the associations are reattached and tracked as well, not just the entity passed to merge().
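For illustration, here is a minimal sketch of what that looks like in Doctrine 2.x annotation mapping. The Invoice/InvoiceLine entities are made up for the example, and $em is assumed to be your EntityManager:

<?php
use Doctrine\ORM\Mapping as ORM;

/** @ORM\Entity */
class Invoice
{
    /** @ORM\Id @ORM\Column(type="integer") @ORM\GeneratedValue */
    private $id;

    /**
     * cascade={"merge"} extends the merge to the invoice's lines.
     *
     * @ORM\OneToMany(targetEntity="InvoiceLine", mappedBy="invoice", cascade={"merge"})
     */
    private $lines;
}

// $invoice was unserialized from a cache or the session, so it is detached.
// merge() returns the managed copy; keep working with that, not with $invoice.
$managedInvoice = $em->merge($invoice);

// Because of cascade={"merge"}, every InvoiceLine reachable from $invoice
// is merged as well; without the cascade, only the Invoice itself would be
// reattached.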
I know this is an old question, but I think it is worth mentioning that $em->merge() is deprecated and will be removed. The deprecation notice reads:
Merge operation is deprecated and will be removed in Persistence 2.0.
Merging should be part of the business domain of an application rather
than a generic operation of ObjectManager.
Also, please read the v3 documentation on how they expect entities to be stored:
https://www.doctrine-project.org/projects/doctrine-orm/en/latest/cookbook/entities-in-session.html#entities-in-the-session
It is a good idea to avoid storing entities in serialized formats such
as $_SESSION: instead, store the entity identifiers or raw data.
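A rough sketch of that advice in practice (the User entity, getId() and $em are only placeholders, not from the question):

<?php
// Instead of serializing the whole entity into the session...
// $_SESSION['user'] = $user;   // discouraged

// ...store only its identifier:
$_SESSION['user_id'] = $user->getId();

// ...and on the next request load a fresh, managed instance from the ORM:
$user = $em->find(User::class, $_SESSION['user_id']);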
I have a User entity group and Transaction entities under it. I auto-allocate ids for the transactions. I want to create a unique key for integration with the payment service. Since a transaction is not a root entity, the auto-allocated ids are not guaranteed to be unique, so I can't use them as the unique keys. What I am currently doing, following the suggestion at
Google Cloud Datastore unique autogenerated ids
is to have a dummy root entity, allocate ids for it, and store that id with the transaction entity as a separate field. However, since it is a dummy, I am currently not writing the dummy entity itself to the datastore.
I have read other posts
When using allocateIds(), how long do unused IDs remain allocated?
and
Are deleted entity IDs once again available to App Engine if auto-generated for an Entity?
but am still not sure. Do I have to insert this dummy entity with just the key? If not, how are all the allocated ids for this dummy entity tracked and what happens to the corresponding storage usage?
An entity key ID, together with the kind and ancestor (and possibly the namespace as well), defines a unique entity key, which is meaningful even if the entity doesn't actually exist: it's possible to have child entities in an entity group ancestry attached to an ancestor which doesn't exist. From Ancestor paths (emphasis mine):
When you create an entity, you can optionally designate another entity
as its parent; the new entity is a child of the parent entity (note
that unlike in a file system, the parent entity need not actually
exist).
So whether your dummy entities actually exist or not should not matter: their key IDs pre-allocated using allocateIds() should never expire. From Assigning identifiers:
Datastore mode's automatic ID generator will keep track of IDs that
have been allocated with these methods and will avoid reusing them for
another entity, so you can safely use such IDs without conflict. You
can not manually choose which values are returned by the
allocateIds() method. The values returned by allocateIds() are
assigned by Datastore mode.
Personal considerations supporting this opinion:
the datastore has no limit on the number of entities of the same kind, ancestor and namespace, so it should support a virtually unlimited number of unique IDs. IMHO this means that there should be no need to even consider re-using them, which is probably why there is no mention of any deadline or expiry time for allocated IDs.
if IDs of deleted entities were ever re-used, it would create a significant problem for restoring datastore entities from backups: newer entities with re-used IDs could potentially be overwritten by the entities that previously used the same IDs
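If it helps to see it concretely, here is a minimal sketch with the PHP client library google/cloud-datastore (the language is only an assumption, since the question doesn't name one; the kind names and the externalId property are also made up, and the User ancestor path is left out for brevity). The dummy entity is never inserted; only its key ID is allocated:

<?php
require 'vendor/autoload.php';

use Google\Cloud\Datastore\DatastoreClient;

$datastore = new DatastoreClient();

// Allocate an ID against the dummy root kind without ever inserting that entity.
$dummyKey = $datastore->allocateId($datastore->key('DummyRoot'));
$uniqueId = $dummyKey->pathEnd()['id'];

// Store the allocated ID as an ordinary property on the transaction entity.
$transaction = $datastore->entity($datastore->key('Transaction'), [
    'externalId' => $uniqueId,
]);
$datastore->insert($transaction);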
I have a Core Data model with something like 20 entities. I want all entities to have common attributes. For example, all of them have a creation date attribute.
I therefore introduced a common entity containing all the common attributes, and all the other entities inherit from this common entity.
This is fine and works well, but then, all entities end up in one single SQLite table (which is rather logical).
I was wondering if there is any clear drawback to this?
For example, in real-life use with 1000+ objects of each entity, would the (single) table become so huge that serious performance problems could occur?
This question has been asked before:
Core Data entity inheritance --> limitations?
Core data performances: when all entities inherit from the same parent entity
Core Data inheritance vs no inheritance
Also keep in mind that when you want to check the SQLite file for debugging purposes, separate tables are easier to examine.
I would use a common NSManagedObject subclass instead of a parent entity.
Don't worry about this. From Core Data documentation:
https://developer.apple.com/library/tvos/documentation/Cocoa/Conceptual/CoreData/Performance.html
... The SQLite store can scale to terabyte-sized databases with billions of rows, tables, and columns. Unless your entities themselves have very large attributes or large numbers of properties, 10,000 objects is considered a fairly small size for a data set.
What is far more important is that if you are doing any heavy operations, like fetching a lot of objects or parsing objects from JSON returned by a web service, you do this off the main thread. This is not very hard to do: look into parent/child managed object contexts and how they can be combined using private and main queue concurrency types. Many good blog posts about this subject exist all over the interwebs.
I've been working on a project with one base entity, around 20 sub-entities, and easily 50k instances overall, for over 2 years now. We've never had performance problems with selects, inserts or updates.
The keys to using Core Data inheritance with large data sets are
optimized fetch requests (tune predicate, exclude irrelevant properties, prefetch relationships, omit subentities, set fetchLimit, use dictionary result type or count-requests if sufficient etc.)
batch saves (meaning not saving the MOC after every insert etc.)
setting up proper indices (they can speed up selects a looot)
structuring your UI appropriately so you won't have to load and display many thousand objects in one viewController
We do not even use parent/child managed object contexts or private queues (which introduce a lot of extra complexity of their own) when importing JSON, as our data model and mapping code is so highly optimized that the UI doesn't even flicker or hang noticeably when importing a few thousand objects.
I am writing a vocabulary learning application.
I have a Wordset Entity.
I want it to contain a property - WordsToLearn (a collection of words to learn for the CURRENT user: words which are either new, i.e. have no repetitions for the current user, or have a repetition due today or earlier).
How can I implement this?
Without this, my object seems very incomplete.
Are entities limited to simple relationships? Should I forget about it here and move this to the WordsetRepository?
It would be very useful to be able to get that information (wordsToLearn) from the Wordset object.
Yes, entities are limited to these simple relationships. For more complex queries you have to use a WordsetRepository, to which you can pass your current_user object and which you can use to get the desired entities in your controllers. You can use Doctrine's DQL to fetch 'real' entity objects instead of just SQL results.
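A rough sketch of what such a repository method could look like with DQL; the entity namespace, the repetitions association and the field names are assumptions, not taken from your mapping:

<?php
use Doctrine\ORM\EntityRepository;

class WordsetRepository extends EntityRepository
{
    /**
     * Words of a wordset the given user still has to learn: either never
     * repeated by this user, or with a repetition due today or earlier.
     */
    public function findWordsToLearn(Wordset $wordset, User $user): array
    {
        return $this->getEntityManager()->createQuery(
            'SELECT w FROM App\Entity\Word w
             LEFT JOIN w.repetitions r WITH r.user = :user
             WHERE w.wordset = :wordset
               AND (r.id IS NULL OR r.dueDate <= :today)'
        )
            ->setParameter('wordset', $wordset)
            ->setParameter('user', $user)
            ->setParameter('today', new \DateTime('today'))
            ->getResult();
    }
}

In a controller you would then call $em->getRepository(Wordset::class)->findWordsToLearn($wordset, $currentUser) instead of expecting the Wordset entity itself to know about the current user.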
I have seen a particular pattern a few times over the last few years. Please let me describe it.
In the UI, each new record (e.g., a new customer's details) is stored on the form without saving to the database. This has clearly been done so as not to clutter the database or cause unnecessary database hits.
While in the UI state, these objects are identified using a Guid. When they are saved to the database, their associated Guids are not stored. Instead, they are assigned a database int as their primary key.
The form can cope with a mixture of items retrieved from the database (using the int) and items that have not yet been committed (using the Guid).
When inspecting the form (using Firebug) to see which key was used, we found that a two-part delimited combined key had been used. The first part is a guid (an empty guid if the record was drawn from the database) and the second part is the integer (zero if it was not drawn from the database). As one part of the combined key will always uniquely identify a record, it works rather well.
Is this good practice or not? Can anyone tell me the pattern's name, or suggest one if it is not already named?
There are a couple patterns at play here.
Identity Field Pattern
Defined in P of EAA as "Saves a database ID field in an object to maintain identity between an in-memory object and a database row." This part is obvious.
Transaction Script and Metadata Mapping
In general, the ASP.NET DataBound controls use something like a Transaction Script pattern in conjunction with a Metadata Mapping pattern. Fowler defines Metadata Mapping as "holding details of object-relational mapping in metadata". If you have ever written a data source control, the Metadata Mapping aspect of this pattern seems obvious.
The Transaction Script pattern "organizes business logic by procedures where each procedure handles a single request from the presentation." In order to encapsulate the logic of maintaining both presentation state and data-state it is necessary for the intermediary object to indicate:
If a database record exists
How to identify the backend data record, to populate the UI control
How to identify the data and the UI control if there is no current data record, so that presentation data can be updated from the backend datastore.
The presence of the new client data entry Guid and the data-record integer Id provide adequate information to determine all of this with only a single call to the database. This could be accomplished by just using integers (and perhaps giving a unique negative integer for each unpersisted UI data item), but it is probably more explicit to have two separate fields.
Good or Bad Practice?
It depends. ASP.NET is a pretty successful software project, and this pattern seems to work consistently. However, this type of ASP.NET web control has a very specific scope of application - to encapsulate interaction between a UI and a database about data objects with simple mappings. The concerns do seem a little blurred, but for many applicable scenarios this will still be acceptable. The pattern is valid wherever a Row Data Gateway would be acceptable. If there is more than one database row affected by a web control, then this approach will not be functional. In these more complex cases, either an Active Record implementation or the combination of a Domain Model and a Repository implementation would be better suited.
Whether a pattern is good or bad practice really depends on the scenario in which it is being applied. It seems like people tend to advocate more complex design structures, because they can be applied to more scenarios without failing. However, in a very simple application where the mappings between data records and the UI are direct, this pattern is very useful because it creates the intended result while minimizing the amount of performance and development overhead.
I don't think there is a specific pattern for that.
Is it good practice? I don't think so. First, it's not very object-oriented. How about:
interface ICommittable
{
    /// <summary>
    /// Gets or sets a value indicating whether the entity was already committed to the database.
    /// </summary>
    bool IsCommitted { get; set; }

    /// <summary>
    /// Gets or sets the ID of the entity, used either in the database or generated by the UI or an underlying BL.
    /// </summary>
    Guid Id { get; set; }
}
Instead, what they do is mix three separate data entries into one in a non-obvious way:
The ID
Another ID (why?)
A flag indicating whether the entity was committed or not.
In particular, having two separate IDs is extremely confusing and will require not only good documentation, but also some time for a new developer to understand what's happening here.
If the purpose was to create new entities without querying the database for a new ID, they could have used GUIDs everywhere: when a new entity is created, you generate its ID with Guid.NewGuid(); then, if needed, you commit everything, with this GUID being the identifier in the database too (the chances of a collision between already-saved GUIDs and a new one are negligible, so I wouldn't worry about that).
Much more simple, isn't it?
It's also not easy to do a few things. For example, how do you compare two entities? Remember that:
Two committed entities with different database IDs are not equal (their GUIDs are both empty),
Two uncommitted entities with different GUIDs are not equal (their database IDs are both zero),
A committed entity may be equal to an uncommitted one, even though both their GUIDs and their IDs differ.
To conclude, it looks like a lack of refactoring. Probably they were modifying a project where entities were already identified in the database by their (int) id as a unique key, so instead of refactoring this, they just added GUIDs, thus making the overall thing:
More difficult to understand,
Very difficult to work with and to modify in future.
If I'm not mistaken, it's the Repository pattern: http://martinfowler.com/eaaCatalog/repository.html
This is well described in Eric Evans' Domain-Driven Design book and has proven to work well under specific circumstances.
What's the recommended way of updating an entity? So far, I figured out two ways:
Just create a new entity with the existing Id and updated property values, and use session.SaveOrUpdate()
Use a DTO: retrieve the existing entity using session.Load(dto.Id), assign the new values from the DTO, then save.
No1 requires much less effort, but sometimes I'm getting an exception: "a different object with the same identifier value was already associated with the session". Is there a simple way around that?
No2 might require an extra trip to the DB I guess?
Sorry if that's been answered already, just couldn't find the answer.
Thanks
ulu
Your second option, with DTOs, is my preferred way. Your DTOs should be specific to the screen (google 'Screen Bound DTOs') so that the screen and your domain can change independently of one another.
It also won't add an extra trip to the DB, since #1 would require a disconnected entity which would have to be reconnected (which triggers a select) after the fact. Worrying about one extra select also smells strongly of premature optimization.
In terms of converting from domain to DTO I'd recommend looking at AutoMapper.
To use No1 you could try to Evict the existing object from NHibernate's session. This will get rid of the error about the object already being in the session.
I would recommend approach number 2, especially if you want to add any sort of optimistic locking. In many cases a single extra hit to the DB won't be that expensive.
Edit
To check if an entity already exists in the session you can use the Contains(obj) method on the Session instance.