Datastore namespacing: What is the best approach for constructing these for hierarchical representations? - google-cloud-datastore

I'm trying to efficiently manage the namespaces for a workflow variable management project in the context of using them as Datastore namespaces, implementing protobuf 'package' lineage, etc. Our project has a concept of variables, some global, some at a job level and some at a task level.
I'm struggling a bit with the most effective way to represent the namespace so that the hierarchy is captured properly. Having the namespace be 'deploy'-friendly between environments is critical, but that part is easy since the environment naturally heads up the hierarchy.
Below that, my gut tells me the namespace structure below represents the hierarchy well and should work, but I'd like other opinions.
env_namespace = dev
vars_namespace = dev.ourproject.vars
global_namespace = dev.ourproject.vars.global
job_namespace = dev.ourproject.vars.global.job
task_namespace = dev.ourproject.vars.global.job.task
Does the above scheme appear correct/efficient for a use case like Datastore?

Datastore namespaces are billed in the Cloud Datastore documentation as being for multi-tenant applications. What this means in practice is that no query can span more than a single namespace. Backup and restore, loading several namespaces into BigQuery, and similar tasks are also a general pain to deal with.
While prefix queries can be made to work in Google Cloud Datastore, they do not work particularly well.
A more conventional way would be to store entities with several fields representing your information and to use different kinds (Datastore's analogue of tables) to represent your data.
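To make that concrete, here is a minimal sketch using the Python client library (google-cloud-datastore). The Variable kind, the property names and the values are all made up for illustration; the point is that environment, job and task become ordinary, filterable properties on a single kind rather than segments of a dotted namespace string.

from google.cloud import datastore

# One client per environment; a short namespace like "dev" is fine here,
# but the rest of the hierarchy lives in entity properties, not the namespace.
client = datastore.Client(namespace="dev")

# Store a task-level variable as an entity of a (hypothetical) "Variable" kind.
entity = datastore.Entity(key=client.key("Variable"))
entity.update({
    "project": "ourproject",
    "scope": "task",          # "global", "job" or "task"
    "job": "nightly_etl",
    "task": "load_users",
    "name": "batch_size",
    "value": 500,
})
client.put(entity)

# Querying by hierarchy level is then an ordinary property filter, which works
# far better than trying to prefix-match a dotted namespace path.
query = client.query(kind="Variable")
query.add_filter("job", "=", "nightly_etl")
query.add_filter("scope", "=", "task")
task_vars = list(query.fetch())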

Related

Is this a bad DynamoDB database schema?

After watching a few videos regarding DynamoDB and its best practices, I decided to give it a try; however, I cannot help but feel what I'm doing may be an anti-pattern. As I understand it, the best practice is to leverage as few tables as possible while also taking advantage of GSIs to do some 'heavy' lifting. Unfortunately, I'm working with a use case that doesn't actually have strictly defined access patterns yet since we're still in early development.
Some early access patterns that we may see are:
Retrieve the number of wins for a particular game: rock paper scissors, boxing, etc. [1 quick lookup]
Retrieve the amount of coins a user has. [1 quick lookup]
Retrieve all the items that someone has purchased (don't care about date). [Not sure?]
Possibly retrieve all the attributes associated with a user (rps wins, box wins, coins, etc). [I genuinely don't know.]
Additionally, there are cases where we will need to complete two operations together. For example, if the user wins a particular game they may receive "coins". Effectively, we'll need to add coins to the user's "coins" attribute and update their number of wins for that game.
Do you think I should revisit this strategy? Additionally, we'll probably start creating 'logs' associated with various games and each individual play.
Designing a DynamoDB data model without fully understanding your application's access patterns is the anti-pattern.
Take the time to define your entities (Users, Games, Orders, etc.), their relationships to one another, and your application's key access patterns. This can be hard work when you are just getting started, but it's absolutely critical to do this when working with DynamoDB. How else can we (or you, or anybody) evaluate whether or not you're using DDB correctly?
When I first worked with DDB, I approached the process in a similar way you are describing. I was used to working with SQL databases, where I could define a few tables and rely on the magic of SQL to support my access patterns as my understanding of the application access patterns evolved. I quickly realized this was not going to work if I wanted to use DynamoDB!
Instead, I started from the front-end of my application. I sketched out the different pages in my app and nailed down the most important concepts in my application. Granted, I may not have covered all the access patterns in my application, but the exercise certainly nailed down the minimal access patterns I'd need to have a usable app.
If you need to rapidly prototype your application to get a better understanding of your access patterns, consider using the skills you and your team already have. If you already understand data modeling with SQL databases, go with that for now. You can always revisit DynamoDB once you have a better understanding of your access patterns and determine that your application can benefit from using a NoSQL database.
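As a concrete illustration of the "two operations at once" case in the question (credit coins and bump a win counter atomically), DynamoDB transactions handle this in a single call. A minimal boto3 sketch with a hypothetical single-table design; the table name, key schema and attribute names are all invented:

import boto3

dynamodb = boto3.client("dynamodb")

# Atomically credit coins on the user's profile item and increment the
# rock-paper-scissors win count on the user's per-game item.
# Both updates succeed together or neither is applied.
dynamodb.transact_write_items(
    TransactItems=[
        {
            "Update": {
                "TableName": "GameData",
                "Key": {"PK": {"S": "USER#123"}, "SK": {"S": "PROFILE"}},
                "UpdateExpression": "ADD coins :c",
                "ExpressionAttributeValues": {":c": {"N": "25"}},
            }
        },
        {
            "Update": {
                "TableName": "GameData",
                "Key": {"PK": {"S": "USER#123"}, "SK": {"S": "GAME#rps"}},
                "UpdateExpression": "ADD wins :w",
                "ExpressionAttributeValues": {":w": {"N": "1"}},
            }
        },
    ]
)

Whether coins and per-game wins live on one item or two is exactly the kind of decision the access-pattern exercise above is meant to settle before you commit to a key design.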

Meteor: Business Objects

I started Meteor a few months ago.
I would like to know if using cursor.observeChanges for business objects is a good idea.
I want to separate operations and views so I can use the same operations in many views/events.
Someone told me we should not separate operations on Mongo from the view.
So my question is: is it a good idea to build Business Objects with Meteor?
Thanks for reading.
cursor.observeChanges is essentially what you get behind the scenes when you do normal find() queries and bind them to template helpers, because the helper's context is reactive.
In the Meteor world, the traditional model/view/controller paradigm is shifted towards a reactive data-on-the-wire concept, including features like latency compensation.
What you refer to as a business object is basically a representation of your business data that is strongly typed, has a type of its own, is atomic, and has the single responsibility of representing that data.
You can achieve that kind of separation of concerns in any language/framework, including meteor. That only depends on how you lay out, structure and abstract your code.
What Meteor brings into the equation is the toolset to build up an interface to your data with modern ux features that are otherwise very hard/expensive to get.
The only concern for business-class applications could be the fact that Meteor currently employs MongoDB by default. MongoDB has its own ongoing debates around business applications: whether they need transaction support, ad-hoc aggregation, foreign-key relationships, etc. But that is another topic.

JDO vs possible programming language change

I have been thinking about using JDO in my new project. However, there is always the possibility of changing the programming language, or rewriting/adding some modules in other languages. But I fear that in that case the data may not be accessible (parseable, understandable).
Are my fears justified?
If you persist data using JDO to, say, an RDBMS, then the data is held in the RDBMS (in a logical, well-defined structure) which is equally accessible using SQL from a different programming language. The datastore is what holds the data, not some mysterious "JDO" thing, which is simply an API for getting the data there from a Java app and Java objects.
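To illustrate that point: suppose the JDO side has persisted a Java Product class to a PRODUCT table (class, table and column names here are invented). Any other language can read that table with plain SQL. A sketch using Python's standard DB-API, against SQLite only to keep the example self-contained; the same query works over JDBC, psycopg2, etc. against whatever RDBMS the JDO application actually targets.

import sqlite3

# Stand-in connection for whatever RDBMS the JDO application writes to.
conn = sqlite3.connect("app.db")

# Read the rows the Java/JDO side persisted; no JDO involved at all.
for product_id, name, price in conn.execute(
    "SELECT ID, NAME, PRICE FROM PRODUCT WHERE PRICE > ?", (10.0,)
):
    print(product_id, name, price)

conn.close()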

Can ASP.NET performance be improved with modules/static classes?

Can using Modules or Shared/Static references to the BLL/DAL improve the performance of an ASP.NET website?
I am working on a site that consists of two projects: one the website, the other a VB.NET class library which acts as a combination of DAL and BLL.
The library is used to communicate with databases and sometimes transform/validate the data going into/coming from the DBs.
Currently each page on the site that needs db access (vast majority) will create an instance of the relevant class in the library to access specific tables.
As I understand it this leads to a class from the library being instantiated and garbage collected for each request, with the possibility of multiple concurrent instances if multiple users view the same page.
If I converted the classes to modules (shared/static classes), would performance increase and memory be saved, since only one instance of each module exists at a time and a new instance does not have to be created for each request?
(if so, does anyone know if having TableAdapters as global variables in the modules would cause problems due to threading?)
Alternatively, would making the references to the library class in the ASP.NET page shared/static have the same effect? (except I would have to rewrite a lot less)
I'm no expert, but I think that the absence of examples of this static class / session object model in books and online is indicative of it being a bad idea.
I inherited a Linq-To-Sql application where the db contexts were static, and after n requests the whole thing just fell apart. The standard model for L2Sql is the Unit-of-Work pattern (define a task or set of tasks - do them and close). Let the framework worry about connection pooling and efficient GC.
Are you just trying to be efficient or do you have performance issues? If the latter it's usually more effective to look at caching or improving query efficiency (use stored procedures, root out queries in loops) than looking at object instantiation.
Statics don't play well with unit tests either (another reason why they have dropped out of fashion).
Instances are only a problem if they are not collected by the GC (a memory leak). Instances are more flexible than statics as well, because you can configure the instance for the specific context in which you are using it.
When an application has poor performance or memory problems, it's usually a sign that:
instances are not properly released (IDisposable)
the amount of data retrieved is too big (not paging large sets of data)
a large number of queries are executed (select n+1, or just a lot of queries)
poorly constructed SQL statements (missing indexes, missing foreign keys, too many joins, etc.)
too many remote calls (either to other servers, or disk)
These are the first things I would check; then start looking at the number of instantiated objects. Chances are that correcting the above-mentioned list will solve most performance bottlenecks.
Can using Modules or Shared/Static references to the BLL/DAL improve the performance of an ASP.NET website?
It's possible, but it depends heavily on how you use your data. One tradeoff in using a single shared instance of an object instead of one per request is that you will need to apply locking unless the objects are strictly read-only, and locking can both slow things down and complicate your code (not to mention being a common source of bugs).
However, if each object is going to contain the exact same data, then the tradeoff may be worth it -- even more so if it can save a DB round-trip.
You might consider using either a Singleton or a small number of parameterized objects rather than a static, though -- and use caching to manage them. That would give you the flexibility to let go of objects that you no longer need, which is harder to do when you're dealing with statics.

Architecture with NHibernate and Repositories

I've been reading up on MVC 2 and the recommended patterns; so far I've come to the conclusion (amongst much hair pulling and total confusion) that:
Model - Is just a basic data container
Repository - Provides data access
Service - Provides business logic and acts as an API to the Controller
The Controller talks to the Service, the Service talks to the Repository and Model. So for example, if I wanted to display a blog post page with its comments, I might do:
post = PostService.Get(id);
comments = PostService.GetComments(post);
Or, would I do:
post = PostService.Get(id);
comments = post.Comments;
If so, where is this being set? From the repository? The problem there is that it's not lazy loaded. That's not a huge problem, but say I wanted to list 10 posts with the first 2 comments for each: I'd have to load the posts, then loop and load the comments, which becomes messy.
All of the examples use "InMemory" repositories for testing and say that including db stuff would be out of scope. But this leaves me with many blanks, so for a start can anyone comment on the above?
post = PostService.Get(id);
comments = post.Comments;
Traversing the Model like that is achievable, and desirable. You are absolutely right though that it is an implementation ideal that comes at a performance price, especially when dealing with collections (and you won't have any hair left when it comes to hierarchic data structures).
You'll want to configure your NH mappings to do batched lazy loading (fetch=subselect, batch-size=#); otherwise eager loading will pull back too much data, and lazy loading will result in an N+1 selects problem (1 query to fetch the posts, plus N queries to fetch comments, where N is the number of posts in your loop).
If your requirement is really to show 2 comments for each post, a batch-size of 2 will do, but as you'll no doubt have guessed, as soon as your app tries to access the 3rd comment, NH will perform another select to fill the comments collection with 2 more, so you might want a bigger batch size from the outset. Plan for a perf tuning phase when you know your use cases. This may be very difficult if you are developing a general purpose data access API. (Also, you will want to add an order-by="SOME_COLUMN_NAME" on your comments collection mapping to control how to get the 'first' comments.) It's easy to underestimate the importance of the NH mapping settings; ORM solves many dev problems, but adds a whole world of new ones.
'Domain Driven Design' by Eric Evans defines the repository pattern & services. They are not always appropriate. I've stopped using them for all but the very complex projects, and rarely on MVC builds. The benefits of the repository pattern & services are separation, isolation, testability and flexibility of your architectural components. In real-world terms - consider your 'usings' namespaces. If you would rather avoid having 'using nhibernate' in your controllers, then hide it away in a repository and just reference the repo assembly.
This ties in with testability - you can unit test your repo in isolation from the controllers. If you are now offended by having repo references in your controllers then employ a service layer. It's all about loose coupling.
Benefits of a service layer include complete hiding of the data access mechanics, exposing the service methods remotely over other transport options (web services, for instance), and veiling generic repository methods with API friendly names. For example, post = MyAwesomeAPI.PostService.Get(id); might simply be a wrapper to a generic - get any type by id - Repository.Get(id); This API wrapping is massively useful when developing a set of services for 3rd parties to consume, or just other devs on your team. Provided your method signatures stay the same, you can change the underlying implementation at any time - your switch from NH to plain SQL for example would not break existing apps that consume that API.
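The thread is C#, but the wrapping idea itself is language-agnostic. Here is a rough sketch (all names invented, in Python for brevity) of a service method that is nothing more than an API-friendly veneer over a generic repository; in the scenario above the dict would be an NHibernate session or raw SQL instead.

from typing import Dict, Generic, Optional, TypeVar

T = TypeVar("T")

class Repository(Generic[T]):
    """Generic 'get any entity by id' data access, backed by a dict here."""

    def __init__(self, items: Dict[int, T]) -> None:
        self._items = items

    def get(self, entity_id: int) -> Optional[T]:
        return self._items.get(entity_id)

class PostService:
    """API-friendly facade; callers never see the repository implementation."""

    def __init__(self, posts: Repository) -> None:
        self._posts = posts

    def get(self, post_id: int):
        # Today this simply delegates; tomorrow it could hit raw SQL or a web
        # service without changing the signature consumers depend on.
        return self._posts.get(post_id)

# The controller (or a third-party consumer) only ever talks to the service.
posts = PostService(Repository({1: {"id": 1, "title": "Hello"}}))
print(posts.get(1))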
For maximum flexibility you would not even link your services assembly to your repo implementation assembly at all. Rather you would use a dependency injection tool like Structure Map to wire everything up at runtime. This allows you to switch repo implementations by configuration alone without recompiling/linking. You could even have multiple data access techniques. The consumer of the API would not know, nor should it care.
If you don't need any of those things, put 'using nhibernate' in your controllers and be done. The risk is that you have tightly tied your MVC app to NH and everyone needs to know everything to make the smallest change to your app. That decision will likely be made by your project constraints (time/money/people/calendar). If you do need all those things, check out Sharp Architecture, or assemble your own stack. MVC is so much more VC than M.
