Java DynamoDBMapper - partially mapped entities - amount of Read Capacity Units

Does the Java DynamoDBMapper load whole items when the @DynamoDBTable-annotated class maps only a subset of their attributes?
Example: a "Product" table, holding items with these attributes: id, name, description. I would like to get the names of several products without loading the description (which would be a huge amount of data).
Does this code load description from DynamoDB?
@DynamoDBTable(tableName = "Product")
public class ProductName {
    private UUID id;
    private String name;

    @DynamoDBHashKey
    @DynamoDBTyped(DynamoDBAttributeType.S)
    public UUID getId() { return id; }
    public void setId(UUID id) { this.id = id; }

    @DynamoDBAttribute
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
...
DynamoDBMapper dynamoDBMapper = ...
dynamoDBMapper.batchLoad(products); // TODO is description loaded? what is the amount of consumed Read Capacity Units?

As their docs say:
DynamoDB calculates the number of read capacity units consumed based on item size, not on the amount of data that is returned to an application. For this reason, the number of capacity units consumed will be the same whether you request all of the attributes (the default behavior) or just some of them (using a projection expression). The number will also be the same whether or not you use a filter expression.
As you can see, projections do not affect the amount of capacity units consumed.
BTW, in your case the description field will be returned anyway, because you do not need to annotate every field with a DynamoDB annotation; only those that are keys, named differently, or need custom converters. All non-annotated fields will be populated from the corresponding DB attributes automatically.
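If the goal is also to avoid transferring the description bytes over the wire (even though the consumed capacity stays the same), a projection expression on the low-level client does that; as far as I know, DynamoDBMapper's batchLoad does not expose projections, which is why this sketch drops down to the low-level API. A minimal sketch reusing the question's table and attribute names (the fetchName helper and the string key handling are illustrative assumptions):

import java.util.Map;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.GetItemRequest;

public class ProductNames {
    // Illustrative helper; table and attribute names taken from the question.
    // "name" is a DynamoDB reserved word, so it must be aliased via an
    // expression attribute name.
    public static String fetchName(AmazonDynamoDB client, String productId) {
        GetItemRequest request = new GetItemRequest()
                .withTableName("Product")
                .addKeyEntry("id", new AttributeValue(productId))
                .withProjectionExpression("id, #n")
                .addExpressionAttributeNamesEntry("#n", "name");
        Map<String, AttributeValue> item = client.getItem(request).getItem();
        return item == null ? null : item.get("name").getS();
    }
}

The read capacity consumed is still based on the full item size, exactly as the quoted docs state; the projection only trims the response payload.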

Related

Unwanted unique constraint in many-to-many relationship

I'm trying to set up a tagging tool for images. Basically I have two tables, one for pictures and one for tags, connected in a many-to-many setup. I can already add a single tag to a picture, and the same tag to different pictures. However, when I try to add a second tag to an image, I get an exception complaining about a unique constraint that I simply don't see.
public class MediaEntity
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public ICollection<TagEntity> Tags { get; set; }
}
public class TagEntity
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public ICollection<MediaEntity> MediaEntities { get; set; }
}
public void updateMedia(MediaEntity model)
{
    using (var db = new MediaContext(_dbLocation))
    {
        db.Update(model);
        db.SaveChanges();
    }
}
public class MediaContext : DbContext
{
    private const string DB_NAME = "PT.db";
    private string _path;
    public DbSet<MediaEntity> MediaTable { get; set; }
    public DbSet<TagEntity> TagTable { get; set; }
    public MediaContext(string path)
    {
        _path = path;
        ChangeTracker.AutoDetectChangesEnabled = false;
    }
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlite($"Data Source={Path.Combine(_path, DB_NAME)}");
}
As far as I can tell my setup should create a normal many-to-many relationship, and in the database I also see pretty much this. EF automatically creates a TagTable, a MediaTable, and a MediaEntityTagEntity join table. But when I try to add a second tag I get this:
SqliteException: SQLite Error 19: 'UNIQUE constraint failed:
MediaEntityTagEntity.MediaEntitiesId, MediaEntityTagEntity.TagsId'.
Data from the table showing I can have the same tag on different pictures:

MediaEntitiesId                          TagEntitiesId
1B48E85B-F097-4216-9B7A-0BA34E69CBFF     CF581257-F176-4CDF-BF34-09013DCEAA27
CE33F03F-5C80-492B-88C6-3C40B9BADC6C     CF581257-F176-4CDF-BF34-09013DCEAA27
523178A1-C7F8-4A69-9578-6A599C1BEBD5     0C45C9D1-7576-4C62-A495-F5EF268E9DF8
I don't see where this unique constraint comes in. How can I set up a proper many-to-many relationship?
I suspect the issue you may be running into is with the detached Media and associated Tags you are sending in. You are telling EF to apply an 'Update' to the media, but the DbContext will have no idea about the state of the attached Tags: some tags may be newly attached, while others are existing relationships. If the context isn't tracking any of these Tags, it will treat them all as inserts, resulting in index violations (many-to-many) or duplicate data (many-to-one / one-to-many).
When dealing with associations like this, it is generally simpler to define more atomic actions like AddTag(mediaId, tagId) and RemoveTag(mediaId, tagId).
If you are applying tag changes along with potential media field updates in a single operation, I would recommend, rather than passing entire entity graphs back and forth, using a viewModel/DTO for the media containing a collection of TagIds, and then applying your tag changes against the media server-side after determining which tags have been added and removed.
I.e.:
public void updateMedia(MediaViewModel model)
{
    using (var db = new MediaContext(_dbLocation))
    {
        var media = db.Medias.Include(x => x.Tags).Single(x => x.MediaId == model.MediaId);
        // Ideally have a Timestamp/row version number to check...
        if (media.RowVersion != model.RowVersion)
            throw new StaleDataException("The media has been modified since the data was retrieved.");
        // copy media fields across...
        media.Name = model.Name;
        // ... etc.
        var existingTagIds = media.Tags
            .Select(x => x.TagId)
            .ToList();
        var tagIdsToRemove = existingTagIds
            .Except(model.TagIds)
            .ToList();
        var tagIdsToAdd = model.TagIds
            .Except(existingTagIds)
            .ToList();
        if (tagIdsToRemove.Any())
        {
            foreach (var tag in media.Tags.Where(x => tagIdsToRemove.Contains(x.TagId)).ToList())
                media.Tags.Remove(tag);
        }
        if (tagIdsToAdd.Any())
        {
            var tagsToAdd = db.Tags.Where(x => tagIdsToAdd.Contains(x.TagId)).ToList();
            foreach (var tag in tagsToAdd)
                media.Tags.Add(tag);
        }
        db.SaveChanges();
    }
}
Using this approach, the DbContext is never left guessing about the state of the media and associated tags. It helps guard against stale data overwrites and unintentional data tampering (if receiving data from web browsers or other unverifiable sources), and by using view models with the minimum required data, you improve performance by minimizing the amount of data sent over the wire and avoid traps like lazy-load hits triggered by serializers.
I always explicitly create the join table. The primary key is the combination of the two 1:M FK attributes. I know EF is supposed to map this automatically, but since it isn't working here, you can specify the structure you know you need.

How to use transactional DatastoreIO

I’m using DatastoreIO from my streaming Dataflow pipeline and getting an error when writing an entity with the same key.
2016-12-10T22:51:04.385Z: Error: (af00222cfd901860): Exception: com.google.datastore.v1.client.DatastoreException: A non-transactional commit may not contain multiple mutations affecting the same entity., code=INVALID_ARGUMENT
If I use a random number in the key then things work, but I need to update the same key, so is there a transactional way to do this using DatastoreIO?
static class CreateEntityFn extends DoFn<KV<String, Tile>, Entity> {
    private static final long serialVersionUID = 0;
    private final String namespace;
    private final String kind;

    CreateEntityFn(String namespace, String kind) {
        this.namespace = namespace;
        this.kind = kind;
    }

    // makeKey and makeValue are presumably statically imported from
    // com.google.datastore.v1.client.DatastoreHelper
    public Entity makeEntity(String key, Tile tile) {
        Entity.Builder entityBuilder = Entity.newBuilder();
        Key.Builder keyBuilder = makeKey(kind, key);
        if (namespace != null) {
            keyBuilder.getPartitionIdBuilder().setNamespaceId(namespace);
        }
        entityBuilder.setKey(keyBuilder.build());
        entityBuilder.getMutableProperties().put("tile", makeValue(tile.toString()).build());
        return entityBuilder.build();
    }

    @Override
    public void processElement(ProcessContext c) {
        String key = c.element().getKey();
        // this works: key = key.concat(":" + UUID.randomUUID().toString());
        c.output(makeEntity(key, c.element().getValue()));
    }
}
...
...
inputData = pipeline
    .apply(PubsubIO.Read.topic(pubsubTopic));
windowedDataStreaming = inputData
    .apply(Window.<String>into(
        SlidingWindows.of(Duration.standardMinutes(15))
            .every(Duration.standardSeconds(31))));
...
...
...
// Create a Datastore entity
PCollection<Entity> siteTileEntities = tileSiteKeyed
    .apply(ParDo.named("CreateSiteEntities").of(new CreateEntityFn(options.getNamespace(), options.getKind())));
// write site tiles to datastore
siteTileEntities
    .apply(DatastoreIO.v1().write().withProjectId(options.getDataset()));
// Run the pipeline
pipeline.run();
Your code snippet doesn't explain how tileSiteKeyed is created. Presumably it's a PCollection<KV<String, Tile>>, but if it might have duplicate String keys, that would explain the issue.
Generally a PCollection<KV<K, V>> may contain multiple KV pairs with the same key. If you'd like to ensure unique keys per window, you can use a GroupByKey to do that. That will give you a PCollection<KV<K, Iterable<V>>> with unique keys per window. Then augment CreateEntityFn to take an Iterable<Tile> and create a single mutation with the changes you need to make, as sketched below.
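A minimal sketch of that approach under the question's Dataflow 1.x SDK setup (the string-concatenation merge and the CreateCombinedEntityFn name are illustrative assumptions; combine the tiles however your schema actually requires):

static class CreateCombinedEntityFn extends DoFn<KV<String, Iterable<Tile>>, Entity> {
    private static final long serialVersionUID = 0;
    private final String namespace;
    private final String kind;

    CreateCombinedEntityFn(String namespace, String kind) {
        this.namespace = namespace;
        this.kind = kind;
    }

    @Override
    public void processElement(ProcessContext c) {
        // Merge all tiles for this key into one value, so the resulting
        // Commit carries a single mutation per key. (Illustrative merge.)
        StringBuilder merged = new StringBuilder();
        for (Tile tile : c.element().getValue()) {
            merged.append(tile.toString());
        }
        Entity.Builder entityBuilder = Entity.newBuilder();
        Key.Builder keyBuilder = makeKey(kind, c.element().getKey());
        if (namespace != null) {
            keyBuilder.getPartitionIdBuilder().setNamespaceId(namespace);
        }
        entityBuilder.setKey(keyBuilder.build());
        entityBuilder.getMutableProperties().put("tile", makeValue(merged.toString()).build());
        c.output(entityBuilder.build());
    }
}
...
// Unique keys per window via GroupByKey, then one entity per key.
PCollection<KV<String, Iterable<Tile>>> groupedTiles = tileSiteKeyed
    .apply(GroupByKey.<String, Tile>create());
PCollection<Entity> siteTileEntities = groupedTiles
    .apply(ParDo.named("CreateUniqueSiteEntities")
        .of(new CreateCombinedEntityFn(options.getNamespace(), options.getKind())));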
This error indicates that Cloud Datastore received a Commit request with two mutations for the same key (i.e. it tries to insert the same entity twice or modify the same entity twice).
You can avoid the error by only including one mutation per key per Commit request.

How to design a table in DynamoDB for carts

I have a use case where a person can have a number of carts. This is possible because 1) a user can complete an order with a cart, after which that cart is considered closed, and 2) a user can leave a cart for 2 months, after which it is considered expired. If the user adds a new item to a 2-month-old cart, the old cart is marked expired and a new cart is generated.
I tried designing the following table:
Table Name: Cart
Primary Hash Key: cartId (String)
Primary Range Key: updated (String)
I am using updated as the range column so that when I query, I get all the carts sorted by when the user updated them and can pick the first (or last) one without sorting in the application to find the most recent cart. However, this is messing up my use cases.
If a user adds another item, I update the item in the cart and update the updated column as well. However, this creates another cart for me (with the same cartId but a new updated value). After re-reading the docs, I understand that the primary key is a composite of cartId and updated, so probably I should remove updated from the key. However, I believe my use case is genuine enough, and it is bad that in my case I have to sort in the application. Another way around would be to use an auto-increment as the range key, but that is a non-intuitive way of putting columns. If there is a workaround, please let me know. I am using DynamoDBMapper and posting my classes (with only a few fields).
import java.util.Set;
import com.amazonaws.services.dynamodbv2.datamodeling.*;

@DynamoDBTable(tableName="Cart")
public class Cart {
    private String cartId;
    private String email;
    private Set<String> cartItemsJson;
    private String status;
    private String created;
    private String updated;

    @DynamoDBHashKey(attributeName="cartId")
    public String getCartId() {
        return cartId;
    }
    public void setCartId(String cartId) {
        this.cartId = cartId;
    }

    @DynamoDBAttribute(attributeName="email")
    public String getEmail() {
        return email;
    }
    public void setEmail(String email) {
        this.email = email;
    }

    @DynamoDBAttribute(attributeName="cartItems")
    public Set<String> getCartItemsJson() {
        return cartItemsJson;
    }
    public void setCartItemsJson(Set<String> cartItemsJson) {
        this.cartItemsJson = cartItemsJson;
    }

    @DynamoDBAttribute(attributeName="status")
    public String getStatus() {
        return status;
    }
    public void setStatus(String status) {
        this.status = status;
    }

    @DynamoDBAttribute(attributeName="created")
    public String getCreated() {
        return created;
    }
    public void setCreated(String created) {
        this.created = created;
    }

    @DynamoDBRangeKey(attributeName="updated")
    @DynamoDBVersionAttribute(attributeName="updated")
    public String getUpdated() {
        return updated;
    }
    public void setUpdated(String updated) {
        this.updated = updated;
    }
}
This is the persistence-layer code. I have tried various combinations of save behavior but still get the same results.
protected static DynamoDBMapper mapper = new DynamoDBMapper(dynamoDbClient);
mapper.save(cart, new DynamoDBMapperConfig(DynamoDBMapperConfig.SaveBehavior.UPDATE));
In DynamoDB you cannot update the hash or range key; updating them means deleting the entry and creating a new one.
I know that you can create a secondary index. Maybe this will help.
Also, I think you can avoid using updated as the range key. You can create the table as follows:
HashKey = userId
RangeKey = cartId (but cartId needs to be sortable for each user)
normal column = updated
normal column = etc.
When you need the last cartId of a specific user, you can get the top 1 row for hashkey = your user, reverse sorted, so you get the most recent one first.
When you need to add an item to the cart, you don't need to update the hash/range keys.
Hope it helps.
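A minimal sketch of that query with DynamoDBMapper, assuming the Cart class is re-keyed with userId as the hash key and a sortable cartId as the range key (the setUserId setter and the CartRepository wrapper are illustrative assumptions):

import java.util.List;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBQueryExpression;

public class CartRepository {
    private final DynamoDBMapper mapper;

    public CartRepository(DynamoDBMapper mapper) {
        this.mapper = mapper;
    }

    // Most recent cart = top 1 row for the user's hash key, reverse sorted.
    public Cart findLatestCart(String userId) {
        Cart hashKey = new Cart();
        hashKey.setUserId(userId); // hypothetical setter on the re-keyed Cart
        DynamoDBQueryExpression<Cart> query = new DynamoDBQueryExpression<Cart>()
                .withHashKeyValues(hashKey)
                .withScanIndexForward(false) // reverse sort: newest range key first
                .withLimit(1);
        List<Cart> carts = mapper.query(Cart.class, query);
        return carts.isEmpty() ? null : carts.get(0);
    }
}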

Collections in QueryDSL projections

I am trying to use a projection to pull in data from an entity and some relations it has. However, the constructor on the projection takes three arguments: a set, an integer, and a string. This all works fine if I don't have the set in there as an argument, but as soon as I add the set, I start getting SQL syntax query errors.
Here is an example of what I'm working with...
@Entity
public class Resource {
    private Long id;
    private String name;
    private String path;

    @ManyToOne
    @JoinColumn(name = "FK_RENDITION_ID")
    private Rendition rendition;
}

@Entity
public class Document {
    private Long id;
    private Integer pageCount;
    private String code;
}

@Entity
public class Rendition {
    Long id;

    @ManyToOne
    @JoinColumn(name="FK_DOCUMENT_ID")
    Document doc;

    @OneToMany(mappedBy="rendition")
    Set<Resource> resources;
}

public class Projection {
    @QueryProjection
    public Projection(Set<Resource> resources, Integer pageCount, String code) {
    }
}
Here is a query like the one I am using (not exactly the same, as this is a simplified version of what I'm dealing with):
QRendition rendition = QRendition.rendition;
Projection projection = from(rendition)
    .where(rendition.document().id.eq(documentId)
        .and(rendition.resources.isNotEmpty()))
    .limit(1)
    .singleResult(
        new QProjection(rendition.resources,
            rendition.document().pageCount,
            rendition.document().code));
This query works fine as long as my projection class does not have rendition.resources in it. If I try to add that in, I start getting malformed SQL errors (it changes the output SQL so that it starts with this):
select . as col_0_0_
So, I guess my main question here is: how do I include a Set as an object in a projection? Is it possible, or am I just doing something wrong here?
Using collections in projections is unreliable in JPA. It is safer to join the collection and aggregate the results instead.
Querydsl can also be used for result aggregation: http://www.querydsl.com/static/querydsl/3.2.0/reference/html/ch03s02.html#d0e1799
In your case, something like this:
// groupBy and set are statically imported from com.mysema.query.group.GroupBy
QRendition rendition = QRendition.rendition;
QDocument document = QDocument.document;
QResource resource = QResource.resource;
Map<Long, Projection> results = from(rendition)
    .innerJoin(rendition.doc, document)
    .innerJoin(rendition.resources, resource)
    .where(document.id.eq(documentId))
    .limit(1)
    .transform(
        groupBy(document.id).as(
            new QProjection(set(resource),
                document.pageCount,
                document.code)));
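Note that transform with groupBy returns a Map keyed by the group expression (here document.id) rather than a single result, so the projection is read out of the map, along these lines:

Projection projection = results.get(documentId);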

Adding/updating child and parent records at the same time

Can someone please show me the easiest way to create/update a parent and child record at the same time (like a customer with multiple addresses), with as little code as possible? Both in Web Forms and in MVC.
The basic idea would be to create/update the parent record and return the new ID (key). Then use that key to create the related child records. For example, say you have an Events table and a related EventDates table:
public static int CreateEvent(
    out int eventId,
    DateTime datePosted,
    string title,
    string venue,
    string street1,
    string city,
    string state,
    string zipCode)
{
    ...
}
public static void AddEventDates(
    int eventDateID,
    int eventID,
    DateTime startDate,
    DateTime endDate)
{
    ...
}
It's important to maintain data integrity here; if one of the updates fails, then both need to be rolled back to the original state. You could implement this yourself or use transactions:
http://msdn.microsoft.com/en-us/library/z80z94hz%28VS.90%29.aspx
