How to solve Google Datastore n+1 query issue? - google-cloud-datastore

Let's say I have parent/child models stored as two Google Datastore kinds.
If I query the child table using the parent key, is there any way to also retrieve the fields from the parent table without having to do n+1 queries?
RequestLedger
key | type | content | sentTimestamp
123 | email | <ssd> | 10-10-10
ResultLedger
key | requestLedger | to | deliveredTimestamp | bouncedTimestamp | other
123-xxx#xxx.com | key(request_ledger,123) | xxx#xxx.com | 10-10-10 | |
code
var query = ds.createQuery(env.get('GCLOUD_DATASTORE_NAMESPACE'), resultLedgerKind)
    .offset(offset)
    .limit(max);
// Optionally restrict the results to the children of one parent entity.
if (parentId) {
    query = query.filter('requestLedger', ds.key([requestLedgerKind, parentId]));
}
query.run(function(err, entities) {
    callback(err, entities);
});
The query above gives me the data from the child table and a reference to the parent entity, but only the parent's key field. Is there an easy way to get everything back in the same query?

You can use the lookup method to get multiple entities by their keys in one request. The REST example is here; I'm convinced the JavaScript client has a similar method.
Note that since API v1 there is a limit of 1000 keys you can get in a single request. This wasn't so in the beta version.
This should bring your request count back down if you first get the children and then look up their respective parent keys.
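As a rough sketch of that "children first, then one batched lookup for the parents" idea: the helper below is not tied to any particular Datastore client. `lookupByKeys` is a hypothetical async function standing in for whatever lookup call your client exposes, and keys are modeled as plain strings so they can be de-duplicated with a Set (real Datastore keys are objects, so you would need to serialize them first).

```javascript
// Sketch only: `lookupByKeys` is a hypothetical stand-in for the client's
// lookup call. It takes an array of keys and resolves to an array of
// entities, each carrying its own key in a `key` field.
const MAX_KEYS_PER_LOOKUP = 1000; // per-request key limit mentioned above

// Split an array into consecutive slices of at most `size` elements.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Attach the parent entity to every child using as few lookups as possible.
async function attachParents(children, lookupByKeys, batchSize = MAX_KEYS_PER_LOOKUP) {
  // Many children share a parent, so fetch each parent key only once.
  const uniqueKeys = [...new Set(children.map((c) => c.requestLedger))];
  const parentsByKey = new Map();
  for (const batch of chunk(uniqueKeys, batchSize)) {
    const parents = await lookupByKeys(batch);
    for (const p of parents) {
      parentsByKey.set(p.key, p);
    }
  }
  // Children whose parent was not found get `parent: null`.
  return children.map((c) => ({
    ...c,
    parent: parentsByKey.get(c.requestLedger) || null,
  }));
}
```

With this shape, a page of children costs one query plus one lookup per 1000 distinct parents, instead of one extra request per child.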

Related

DynamoDB Global Secondary Index "Batch" Retrieval

I've seen older posts about this, but I'm hoping to bring the topic up again. I have a table in DynamoDB that has a UUID for the primary key, and I created a global secondary index (GSI) for a more business-friendly key. For example:
| account_id | email | first_name | last_name |
|------------ |---------------- |----------- |---------- |
| 4f9cb231... | linda#gmail.com | Linda | James |
| a0302e59... | bruce#gmail.com | Bruce | Thomas |
| 3e0c1dde... | harry#gmail.com | Harry | Styles |
If account_id is my primary key and email is my GSI, how do I query the table to get accounts with email in ('linda#gmail.com', 'harry#gmail.com')? I looked at the IN conditional expression, but it doesn't appear to work with a GSI. I'm using the Go SDK v2 library but will take any guidance. Thanks.
Short answer, you can't.
DDB is designed to return a single item, via GetItem(), or a set of related items, via Query(). Related meaning that you're using a composite primary key (hash key & sort key) and the related items all have the same hash key (aka partition key).
Another way to think of it, you can't Query() a DDB Table/index. You can only Query() a specific partition in a table or index.
Scan() is the only operation that works across partitions in one shot. But scanning is very inefficient and costly since it reads the entire table every time.
You'll need to issue a GetItem() for every email you want returned.
Luckily, DDB now offers BatchGetItem(), which lets you send multiple GetItem() requests, up to 100, in a single call. It saves a little bit of network time and automatically runs the requests in parallel, but otherwise it is little different from what your application could do itself directly with GetItem(). Make no mistake, BatchGetItem() is making individual GetItem() requests behind the scenes. In fact, the requests in a BatchGetItem() don't even have to be against the same tables/indexes. The cost for each request in a batch will be the same as if you'd used GetItem() directly.
One difference to note: BatchGetItem() can return at most 16 MB of data. So if your DDB items are large, you may not get as many returned as you requested.
For example, if you ask to retrieve 100 items, but each individual
item is 300 KB in size, the system returns 52 items (so as not to
exceed the 16 MB limit). It also returns an appropriate
UnprocessedKeys value so you can get the next page of results. If
desired, your application can include its own logic to assemble the
pages of results into one dataset.
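To make the paging behaviour described in that quote concrete, here is a minimal sketch of the "keep asking until nothing is unprocessed" loop. It does not call the AWS SDK: `batchGet` is a hypothetical async wrapper you would write around BatchGetItem, assumed to resolve to `{ items, unprocessedKeys }`. The `maxRounds` guard is also my own addition.

```javascript
// Sketch only: `batchGet` is a hypothetical wrapper around BatchGetItem that
// takes an array of keys and resolves to { items, unprocessedKeys }.
const MAX_KEYS_PER_BATCH = 100; // per-call request limit mentioned above

async function batchGetAll(keys, batchGet, maxRounds = 10) {
  const items = [];
  for (let i = 0; i < keys.length; i += MAX_KEYS_PER_BATCH) {
    // Start each batch with at most 100 keys.
    let pending = keys.slice(i, i + MAX_KEYS_PER_BATCH);
    let rounds = 0;
    // Re-send whatever came back as unprocessed until the batch is drained.
    while (pending.length > 0) {
      if (rounds++ >= maxRounds) {
        throw new Error('gave up: keys still unprocessed after ' + maxRounds + ' rounds');
      }
      const { items: got, unprocessedKeys } = await batchGet(pending);
      items.push(...got);
      pending = unprocessedKeys || [];
    }
  }
  return items;
}
```

In real code you would probably also add a backoff delay between rounds, since unprocessed keys can indicate throttling as well as the size limit.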
Because you have a GSI with a PK of email (from what I understand), you can use a PartiQL statement to get your batch of emails back. The API is called ExecuteStatement, and it uses a SQL-like syntax:
SELECT * FROM "mytable"."myindex" WHERE email IN ['email#email.com','email1#email.com']
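If the list of emails comes from user input, it is probably safer to pass the values as parameters than to paste them into the statement. The sketch below only builds the request object (a `Statement` string with `?` placeholders plus a `Parameters` list of string attribute values) and leaves the actual ExecuteStatement call to you. The function name is made up, and it assumes the table and index names are trusted, since they are interpolated directly.

```javascript
// Sketch only: builds an ExecuteStatement-style input for an IN lookup on a
// GSI. `buildEmailInStatement` is a hypothetical helper name; table and index
// names are assumed to be trusted because they are interpolated as-is.
function buildEmailInStatement(tableName, indexName, emails) {
  if (emails.length === 0) {
    throw new Error('at least one email is required');
  }
  // One "?" placeholder per email, e.g. "?, ?, ?".
  const placeholders = emails.map(() => '?').join(', ');
  return {
    Statement:
      `SELECT * FROM "${tableName}"."${indexName}" WHERE email IN [${placeholders}]`,
    // Each parameter is a DynamoDB string attribute value.
    Parameters: emails.map((email) => ({ S: email })),
  };
}
```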

DynamoDB sub item filter using .Net Core API

First of all, I have a table structure like this:
Users: {
  UserId
  Name
  Email
  SubTable1: [{
    Column-111
    Column-112
  },
  {
    Column-121
    Column-122
  }]
  SubTable2: [{
    Column-211
    Column-212
  },
  {
    Column-221
    Column-222
  }]
}
As I am new to DynamoDB, I have a couple of questions:
1. Can I create a structure like this?
2. Can we set a primary key for subtables?
3. Luckily, I found a DynamoDB helper class to do some operations on my DB.
https://www.gopiportal.in/2018/12/aws-dynamodb-helper-class-c-and-net-core.html
But I don't know how to fetch only a particular subtable.
4. Can we fetch only specific columns from my main table? I also need suggestions for subtables.
Note: I am using .NET Core (C#) to communicate with DynamoDB.
Can I create structure like this?
Yes
Can we set primary key for subtables?
No. A hash key can be set on top-level scalar attributes only (String, Number, etc.).
Luckily, I found DynamoDB helper class to do some operations into my DB.
https://www.gopiportal.in/2018/12/aws-dynamodb-helper-class-c-and-net-core.html
But I don't know how to fetch only a particular subtable.
When you say subtables, I assume that you are referring to the Array datatype in the sample table above. To fetch data from a DynamoDB table with the Query API, you need the hash key. If you don't have the hash key, you can use the Scan API, which scans the entire table. The Scan API is a costly operation.
A GSI (Global Secondary Index) can be created to avoid the scan operation. However, it can be created on scalar attributes only. A GSI can't be created on an Array attribute.
Another option is to redesign the table to match your query access pattern.
Can we fetch only specific columns from my main table? Also need suggestion for subtables
Yes, you can fetch specific columns using ProjectionExpression. This way you get only the required attributes in the result set.
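To illustrate ProjectionExpression, here is a small sketch that builds Query parameters returning only the listed attributes, for example just Name and SubTable1. It is written in JavaScript rather than C# and only constructs the request object. It assumes UserId is the table's hash key and a string, and both function names are invented for this example. Every attribute goes through an ExpressionAttributeNames placeholder, because names that are reserved words (such as Name) or contain a hyphen (such as Column-111) cannot appear directly in an expression.

```javascript
// Sketch only: hypothetical helpers that build Query parameters. Assumes the
// hash key is a string attribute called UserId.

// Map each attribute to a "#pN" placeholder so reserved words and names with
// hyphens are safe to project.
function buildProjection(attributeNames) {
  const names = {};
  const placeholders = attributeNames.map((attr, i) => {
    const placeholder = `#p${i}`;
    names[placeholder] = attr;
    return placeholder;
  });
  return {
    ProjectionExpression: placeholders.join(', '),
    ExpressionAttributeNames: names,
  };
}

// Build Query parameters that fetch one user and return only the requested
// top-level attributes (a whole "subtable" list counts as one attribute).
function buildUserQuery(tableName, userId, attributeNames) {
  const projection = buildProjection(attributeNames);
  return {
    TableName: tableName,
    KeyConditionExpression: '#pk = :pk',
    ExpressionAttributeNames: { ...projection.ExpressionAttributeNames, '#pk': 'UserId' },
    ExpressionAttributeValues: { ':pk': { S: userId } },
    ProjectionExpression: projection.ProjectionExpression,
  };
}
```

Calling `buildUserQuery('Users', 'u1', ['Name', 'SubTable1'])` yields parameters that would return only the Name attribute and the whole SubTable1 list for that user.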

How to retrieve an entity using a property from datastore

Is it possible to retrieve an entity from gae datastore using a property and not using the key?
I could see I can retrieve entities with key using the below syntax.
quote = mgr.getObjectById(Students.class, id);
Is there an alternative that enables us to use a property instead of key?
Or please suggest any other ways to achieve the requirement.
Thanks,
Karthick.
Of course this is possible. Think of the key of an entity being like the primary key of an SQL row (but please, don't stretch the analogy too far - the point is it's a primary key - the implementations of these two data storage systems are very different and it causes people trouble when they don't keep this in mind).
You should look either here (JDO) to read about JDO queries or here (JPA) to read about JPA queries, depending what kind of mgr your post refers to. For JDO, you would do something like this:
// begin building a new query on the Cat-kind entities (given a properly annotated
// entity model class "Cat" somewhere in your code)
Query q = pm.newQuery(Cat.class);
// set filter on species property to == param
q.setFilter("species == speciesParam");
// set ordering for query results by age property descending
q.setOrdering("age desc");
// declare the parameters for this query (format is "<Type> <name>")
// as referenced above in filter statement
q.declareParameters("String speciesParam");
// run the query
List<Cat> results = (List<Cat>) q.execute("siamese");
For JPA, you would use JPQL strings to run your queries.

Doctrine 2.1 - Map entity to multiple tables

I have the following database situation:
wp_users (user table generated by wordpress)
ID | user_login | ...
wp_sp_user (extension to the wp_users table)
ID (FK) | surname | address | ...
Now I've already been trying for hours to "fuse" those two tables into one single User entity, e.g:
class User {
var ID;
var user_login;
var surname;
var address;
...
}
Is there any way to accomplish such a mapping without modifying the wp_users table (which I don't want to do for updating reasons)?
Sometimes database refactoring is not possible, or the table has its own "raison d'être". In these cases you can use inheritance. Your User class can extend Account. Map Account to wp_users and extend it with the wp_sp_user table. The User class will then use the columns of both tables.
Here is the doctrine documentation:
https://www.doctrine-project.org/projects/doctrine-orm/en/current/reference/inheritance-mapping.html
This is not possible. It also doesn't make sense to do so.
You will need to physically merge the tables together in MySQL and create a Doctrine entity for that table. This is the only way you can ensure your data is clean and fully normalized.
Another possible solution is to create one entity for each table and use a business object to combine results from each. This is not a very nice solution at all, as you will have to handle constraints in the application layer, and you will double the number of queries you launch.

What's the best way to retrieve this data?

The architecture for this scenario is as follows:
I have a table of items and several tables of forms. Rather than having the forms own the items, the items own the forms. This is because one item can be on several forms (although only one of each type, but not necessarily on any). The forms and items are all tied together by a common OrderId. This can be represented like so:
OrderItems | Form A | Form B etc....
---------- |--------- |
ItemId |FormAId |
OrderId |OrderId |
FormAId |SomeField |
FormBId |OtherVar |
FormCId |etc...
This works just fine for these forms. However, there is another form, (say, FormX) which cannot have an OrderId because it consists of items from multiple orders. OrderItems does contain a column for FormXId as well, but I'm confused about the best way to get a list of the "FormX"s related to a single OrderId. I'm using MySQL and was thinking maybe a stored proc was the best way to go on this, but I've never used a stored proc on MySQL and don't really know the best way to go about it. My other (kludgy) option was to hit the DB twice, first to get all the items that are for the given OrderId that also have a FormXId, and then get all their FormXIds and do a dynamic SELECT statement where I do something like (pseudocode)
SELECT whatever FROM sometable WHERE FormXId=x OR FormXId=y....
Obviously this is less than ideal, but I can't really think of any other way... anything better I could do either programmatically or architecturally? My back-end code is ASP.NET.
Thanks so much!
UPDATE
In response to the request for more info:
Sample input:
OrderId = 1000
Sample output
FormXs:
-----------------
FormXId | FieldA | FieldB | etc
-------------------------------
1003 | value | value | ...
1020 | ... .. ..
1234 | .. . .. . . ...
You see the problem is that FormX doesn't have one single OrderId but is rather a collection of OrderIds. Sometimes multiple items from the same order are on FormX, sometimes it's just one, most orders don't have any items on FormX. But when someone pulls up their order, I need for all the FormXs their items belong on to show up so they can be modified/viewed.
I was thinking of maybe creating a stored proc that does what I said above, run one query to pull down all the related OrderIds and then another to return the appropriate FormXs. But there has to be a better way...
I understand you need to get a list of the "FormX"s related to a single OrderId. You say that OrderItems contains a column for FormXId.
You can issue the following query:
SELECT
    FormX.*
FROM
    OrderItems
JOIN
    FormX
ON
    OrderItems.FormXId = FormX.FormXId
WHERE
    OrderItems.OrderId = @orderId
You need to pass @orderId to your query, and you will get a record set of the FormX records related to this order.
You can either package this query up as a stored procedure with an @orderId parameter, or you can use dynamic SQL and substitute @orderId with the actual order number you are running the query for.

Resources