Is there a good way to design recurring events (datetime) in DynamoDB?
In a relational database, this can be done using a recursive method.
The only way I can think of right now is to create a record for each event out to some far future date.
Edit:
The question was too vague, so I am updating it for clarity. What I am referring to is a Recurrence Rule from RFC 5545, the Internet Calendaring and Scheduling Core Object Specification (iCalendar).
Something like what this would create
RRULE:FREQ=WEEKLY;BYDAY=MO;COUNT=X
I am just wondering what other people have done in DynamoDB to handle the Recurrence Rule in their calendaring.
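For reference, a minimal sketch of the "create a record for each occurrence" idea mentioned above, using the rrule npm package to expand the rule and the AWS SDK v3 to batch-write one item per occurrence. The table name (CalendarTable), the pk/sk attribute names, and the dtstart are assumptions for illustration, not anything DynamoDB prescribes.

// Hedged sketch: materialize RRULE occurrences as individual DynamoDB items.
// Table name, key shapes, and dtstart are placeholders.
import { RRule } from "rrule";
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, BatchWriteCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

async function materializeWeeklyEvent(eventId: string, count: number) {
  // RRULE:FREQ=WEEKLY;BYDAY=MO;COUNT=<count>
  const rule = new RRule({
    freq: RRule.WEEKLY,
    byweekday: [RRule.MO],
    count,
    dtstart: new Date(Date.UTC(2024, 0, 1, 9, 0, 0)),
  });

  // One item per occurrence; the sort key is the occurrence timestamp, so a
  // single Query can fetch a date range for the event.
  const puts = rule.all().map((occurrence) => ({
    PutRequest: {
      Item: {
        pk: `EVENT#${eventId}`,
        sk: `OCCURRENCE#${occurrence.toISOString()}`,
        rrule: rule.toString(),
      },
    },
  }));

  // BatchWrite accepts at most 25 items per request; a production version
  // would also retry any UnprocessedItems returned in the response.
  for (let i = 0; i < puts.length; i += 25) {
    await ddb.send(
      new BatchWriteCommand({
        RequestItems: { CalendarTable: puts.slice(i, i + 25) },
      })
    );
  }
}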
After watching a few videos regarding DynamoDB and its best practices, I decided to give it a try; however, I cannot help but feel that what I'm doing may be an anti-pattern. As I understand it, the best practice is to leverage as few tables as possible while also taking advantage of GSIs to do some 'heavy' lifting. Unfortunately, I'm working with a use case that doesn't have strictly defined access patterns yet, since we're still in early development.
Some early access patterns that we may see are:
Retrieve the number of wins for a particular game: rock paper scissors, boxing, etc. [1 quick lookup]
Retrieve the amount of coins a user has. [1 quick lookup]
Retrieve all the items that someone has purchased (don't care about date). [Not sure?]
Possibly retrieve all the attributes associated with a user (rps wins, box wins, coins, etc). [I genuinely don't know.]
Additionally, there may be cases where we need to complete two operations together. For example, if the user wins a particular game they may receive "coins". Effectively, we'll need to add coins to the user's "coins" attribute and update their number of wins for that game.
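If the coin balance and the per-game win count end up on separate items, one option for that two-part update is a DynamoDB transaction. A hedged sketch, assuming a hypothetical single table called GameTable with generic pk/sk keys:

// Hedged sketch: increment coins and wins atomically with TransactWriteItems.
// Table name and key shapes are illustrative only.
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, TransactWriteCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

async function recordWin(userId: string, game: string, coinsAwarded: number) {
  await ddb.send(
    new TransactWriteCommand({
      TransactItems: [
        {
          // Increment the user's coin balance on their profile item.
          Update: {
            TableName: "GameTable",
            Key: { pk: `USER#${userId}`, sk: "PROFILE" },
            UpdateExpression: "ADD coins :c",
            ExpressionAttributeValues: { ":c": coinsAwarded },
          },
        },
        {
          // Increment the win counter on the per-game stats item.
          Update: {
            TableName: "GameTable",
            Key: { pk: `USER#${userId}`, sk: `GAME#${game}` },
            UpdateExpression: "ADD wins :one",
            ExpressionAttributeValues: { ":one": 1 },
          },
        },
      ],
    })
  );
}

If both values lived on the same item, a single UpdateItem with "ADD coins :c, wins :one" would be enough and no transaction would be needed.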
Do you think I should revisit this strategy? Additionally, we'll probably start creating 'logs' associated with various games and each individual play.
Designing a DynamoDB data model without fully understanding your application's access patterns is the anti-pattern.
Take the time to define your entities (Users, Games, Orders, etc.), their relationships to one another, and your application's key access patterns. This can be hard work when you are just getting started, but it's absolutely critical to do when working with DynamoDB. How else can we (or you, or anybody) evaluate whether or not you're using DDB correctly?
When I first worked with DDB, I approached the process in a similar way to what you are describing. I was used to working with SQL databases, where I could define a few tables and rely on the magic of SQL to support my access patterns as my understanding of the application evolved. I quickly realized this was not going to work if I wanted to use DynamoDB!
Instead, I started from the front end of my application. I sketched out the different pages in my app and nailed down the most important concepts in my application. Granted, I may not have covered every access pattern, but the exercise certainly pinned down the minimal set I'd need for a usable app.
If you need to rapidly prototype your application to get a better understanding of your access patterns, consider using the skills you and your team already have. If you already understand data modeling with SQL databases, go with that for now. You can always revisit DynamoDB once you have a better understanding of your access patterns and determine that your application can benefit from using a NoSQL database.
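To make the point concrete, here is a hedged illustration of how the access patterns listed in the question could map onto a single-table design once they are written down. Everything here (the GameTable name, the USER#/GAME#/PURCHASE# key shapes) is a hypothetical layout carried over from the sketch above, not a prescription.

// Hedged sketch: each stated access pattern becomes one key condition.
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand, QueryCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));
const TableName = "GameTable";

// "Wins for a particular game" -> one GetItem on the per-game stats item.
const winsForGame = (userId: string, game: string) =>
  ddb.send(new GetCommand({ TableName, Key: { pk: `USER#${userId}`, sk: `GAME#${game}` } }));

// "Coins for a user" -> one GetItem on the profile item.
const coinsForUser = (userId: string) =>
  ddb.send(new GetCommand({ TableName, Key: { pk: `USER#${userId}`, sk: "PROFILE" } }));

// "All items someone has purchased" -> one Query over a sort-key prefix.
const purchasesForUser = (userId: string) =>
  ddb.send(
    new QueryCommand({
      TableName,
      KeyConditionExpression: "pk = :pk AND begins_with(sk, :prefix)",
      ExpressionAttributeValues: { ":pk": `USER#${userId}`, ":prefix": "PURCHASE#" },
    })
  );

// "Everything about a user" -> one Query over the whole partition.
const everythingForUser = (userId: string) =>
  ddb.send(
    new QueryCommand({
      TableName,
      KeyConditionExpression: "pk = :pk",
      ExpressionAttributeValues: { ":pk": `USER#${userId}` },
    })
  );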
So I was wondering: the one downside I see to NoSQL is that if my front-end app ever changes drastically, I would have a horrible time remodeling my database. This is because NoSQL is designed with the front end in mind first. So if the front end changes, the back end changes (at least that is the general idea).
So my idea is: would it be smart to store all my ORIGINAL/PURE copies of documents in multiple root collections, and then create "views" collections, which are the collections my app will actually call? What I like about this is that my data is always "SQL" at the root if I ever need to change my front end, but my "views" are what my app will use.
This is a lot like the dimension/reference table and fact table design people use.
The big reason, once again, for this idea: if my front end changes drastically, I would otherwise need to do serious work converting these "views" into other "views". With my idea, you would just delete your old "views" and create new "views" from your "sql"/"root" reference collections.
Do I make sense? :) I have not used NoSQL (but I am building something with it now, so my brain is still battling the switch from SQL to NoSQL haha). So if this is a "dude, don't worry about it" case, then you can give that as an answer as well haha.
Yup, that is indeed a fairly common approach. In recent answers about NoSQL data modeling I have started calling this out explicitly:
Make sure you have a single point of definition for each entity/value.
Make sure all other occurrences of that same value are derived from #1.
With these two in mind, fanning out/duplicating the data is a fairly straightforward process (literally so, as it's unidirectional), and it can easily be redone by wiping the derived data and rerunning the fan-out process.
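As a deliberately simplified sketch of that wipe-and-rerun step with the Firestore Admin SDK; the collection names ("users", "orders", "views_user_dashboard") and the shape of the derived view document are made up for the example:

// Hedged sketch: delete the derived "view" documents, then rerun the
// one-directional fan-out from the root/"pure" collections.
import * as admin from "firebase-admin";

admin.initializeApp();
const db = admin.firestore();

async function rebuildUserDashboardViews() {
  // 1. Wipe the derived "view" collection.
  const existing = await db.collection("views_user_dashboard").get();
  for (const doc of existing.docs) {
    await doc.ref.delete();
  }

  // 2. Rerun the fan-out from the root collections.
  const users = await db.collection("users").get();
  for (const user of users.docs) {
    const orders = await db
      .collection("orders")
      .where("userId", "==", user.id)
      .get();

    // The derived document duplicates whatever the view needs, so the app
    // can read the dashboard with a single document fetch.
    await db.collection("views_user_dashboard").doc(user.id).set({
      displayName: user.get("displayName"),
      orderCount: orders.size,
    });
  }
}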
Some good pointers to learn more about NoSQL data modeling:
NoSQL data modeling
Getting to know Cloud Firestore
And these previous questions:
Maxing out document storage in Firestore
How to write denormalized data in Firebase
Understanding NoSQL CRUD calls
I started Meteor a few months ago.
I would like to know if using cursor.observeChanges for business objects is a good idea.
I want to separate operations and views so I can use the same operations in many views/events, and I want to know if that is a good idea.
Someone told me we should not separate operations on Mongo from the view.
So my question is: is it a good idea to do Business Objects with Meteor?
Thanks for reading.
cursor.observeChanges is essentially what you get behind the scenes when you do normal find() queries and bind them to template helpers, since their context is reactive.
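To illustrate the difference, a hedged sketch (the Posts collection, field names, and template name are placeholders):

// Typical usage: the helper just returns a cursor; Blaze observes it for you.
import { Mongo } from "meteor/mongo";
import { Template } from "meteor/templating";

const Posts = new Mongo.Collection("posts");

Template.postList.helpers({
  posts() {
    return Posts.find({}, { sort: { createdAt: -1 } });
  },
});

// Manual usage: you receive the low-level change notifications yourself,
// which is what you might reach for when driving non-template code.
Posts.find({ published: true }).observeChanges({
  added(id, fields) {
    console.log("added", id, fields);
  },
  changed(id, fields) {
    console.log("changed", id, fields);
  },
  removed(id) {
    console.log("removed", id);
  },
});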
In the meteor world, the traditional model/view/controller paradigm is shifted towards a reactive data-on-the-wire concept including features like latency compensation.
What you refer to as a business object is basically a representation of your business data that is strongly typed, has a type of its own, is atomic, and has the single task of representing that data.
You can achieve that kind of separation of concerns in any language/framework, including meteor. That only depends on how you lay out, structure and abstract your code.
What Meteor brings into the equation is the toolset to build up an interface to your data with modern UX features that are otherwise very hard/expensive to get.
The only concern for business-class applications could be the fact that Meteor currently employs MongoDB by default. MongoDB has its own ongoing discussions around business applications: whether they need transaction support, ad-hoc aggregation, foreign-key relationships, etc. But that is another topic.
Suppose we have a web service that aggregates 20 000 users, and each one of them is linked to 300 unique user data entities containing whatever. Here's a naive approach to designing an example relational database that would be able to store the above data:
Create table for users.
Create table for user data.
And thus, the user data table contains 6 000 000 rows.
Querying tables that have millions of rows is slow, especially since we have to deal with hierarchical data and do some uncommon computations much different from SELECT * FROM userdata. At any given point, we only need specific user's data, not the whole thing - getting it is fast - but we have to do weird stuff with it later. Multiple times.
I'd like our web service to be fast, so I thought of the following approaches:
Optimize the hell out of the queries, do a lot of caching, etc. This is nice, but these are just temporary workarounds. When the database grows even further, they will cease to work.
Rewriting our model layer to use NoSQL technology. This is not possible due to the lack of relational database features, and even if we wanted this approach, early tests made some functionality even slower than it already was.
Implement some kind of scalability. (You hear about cloud computing a lot nowadays.) This is the most wanted option.
Implement some manual solution. For example, I could store all the users with names beginning with the letters "A".."M" on server 1, while all other users would belong to server 2 (a sketch of this routing idea follows the question). The problem with this approach is that I would have to redesign our architecture quite a lot, and I'd like to avoid that.
Ideally, I'd have some kind of transparent solution that would allow me to query a seemingly uniform database server with no changes to my code whatsoever. The database server would scatter its table data across many workers in a smart way (much like database optimizers do), thus effectively speeding everything up. (Is this even possible?)
In both cases, achieving interoperability seems like a lot of trouble...
Switching from SQLite to a Postgres or Oracle solution. This isn't going to be cheap, so I'd like some kind of confirmation before doing it.
What are my options? I want all my SELECTs and JOINs with indexed data to be real-time, but the bigger the userdata gets, the more expensive the queries become.
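For reference, the manual partitioning option above boils down to something like the routing shim below, which also shows why it touches every data-access path. The shard names and connection strings are placeholders.

// Hedged sketch: route each user to a shard by the first letter of their name.
const SHARDS: Record<string, string> = {
  server1: "postgres://db1.example.com/app",
  server2: "postgres://db2.example.com/app",
};

function shardForUser(username: string): string {
  const first = username.trim().charAt(0).toUpperCase();
  return first >= "A" && first <= "M" ? SHARDS.server1 : SHARDS.server2;
}

// Every query for a given user's data would then have to go through
// shardForUser(name), which is exactly the application-level rework the
// question wants to avoid.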
I don't think you should use NoSQL by default just because you have that amount of data. What kind of issue are you expecting it to solve?
IMHO this depends on your queries. You haven't mentioned any kind of massive write load, so SQL is still appropriate so far.
It sounds like you want to perform queries using JOINs. This can be slow on very large data sets even with appropriate indexes. What you can do is lower your level of decomposition and just duplicate the data (so it all sits in one database row and is fetched together from the hard drive). If you are concerned about latency, avoiding joins is a good approach. But that still does not eliminate SQL, as you can duplicate data in SQL as well.
Significant for your decision making should be the structure of your queries. Do you want to SELECT only a few fields within your queries (SQL), or do you always want to fetch the whole document (e.g. Mongo & JSON)?
The second significant criterion is scalability, as NoSQL often relaxes the usual SQL guarantees (settling for eventual consistency, for example) so that it can provide better results when scaling out.
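A hedged sketch of the "fetch the whole document" shape, for contrast: each user's data entities are duplicated into one document so a single lookup returns everything, with no join. The database, collection, and field names are invented for illustration, and the same denormalization could be done in SQL, as noted above.

// Hedged sketch: one denormalized document per user, fetched in one findOne.
import { MongoClient } from "mongodb";

interface UserDocument {
  _id: string;
  name: string;
  // Denormalized copy of what would otherwise live in a separate userdata table.
  userdata: Array<{ kind: string; payload: unknown }>;
}

async function loadUser(userId: string): Promise<UserDocument | null> {
  const client = new MongoClient("mongodb://localhost:27017");
  try {
    await client.connect();
    return await client
      .db("app")
      .collection<UserDocument>("users")
      .findOne({ _id: userId });
  } finally {
    await client.close();
  }
}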
Is it possible to use Neo4j's Cypher query language (or another declarative language) but still reference custom code snippets (for instance, to do custom WHERE clauses based on, say, the result of an ElasticSearch/Lucene search)?
If other graph DBs have declarative languages that support this, please shoot. I'm in no way bound to Neo4j.
Background:
I'm doing some research whether to include Neo4J in my current stack, which in the backend already consists of ElasticSearch, MongoDB and Redis.
Particularly with Redis' fast set-intersection capability, I could potentially build some crude graph-like querying (although likely not as performant as a graph DB). I'm a long way into defining a DSL, with the types of queries to support.
However, I'm designing a CMS, so the content types, and the relationships between these content types (which I would like to model with a graph), are not known beforehand.
Therefore the ideal case, of populating the needed Redis collections (with Mongo as the source) to support all my querying based on content types and relationships that are not known at design time, will be messy to say the least. Hope you're still following.
Which leads me to conclude that another solution may be needed, which is why I'm looking at graph DBs, and Neo4j in particular (if others are potentially better suited to my use case, do shoot).
If you model your content types as nodes, you don't need to know them beforehand.
User-defined functions in JavaScript are planned for Cypher later this year.
You can use a language like Gremlin to declare your functions in Groovy, though.
You can store the node IDs in Redis and then pass an array of IDs returned by Redis into a Cypher query for further processing, for example:
start n=node({ids})
match n-[:HAS_TYPE]->content_type<-[:HAS_TYPE]-other_content
return content_type, count(*)
order by count(*) desc
limit 10
parameters: {"ids": [1,2,3,5]}
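Put together, the flow described above might look like this in application code. A hedged sketch: the Redis set names, connection details, and credentials are assumptions, and the query is the one above rewritten in current Cypher syntax ($ids parameters and parenthesized node patterns).

// Hedged sketch: intersect precomputed id sets in Redis, then aggregate in Cypher.
import Redis from "ioredis";
import neo4j from "neo4j-driver";

async function topContentTypes() {
  const redis = new Redis();
  // e.g. intersect two precomputed sets of node ids kept in Redis.
  const ids = (await redis.sinter("type:article", "tag:featured")).map(Number);
  await redis.quit();

  const driver = neo4j.driver("bolt://localhost:7687", neo4j.auth.basic("neo4j", "password"));
  const session = driver.session();
  try {
    const result = await session.run(
      `MATCH (n) WHERE id(n) IN $ids
       MATCH (n)-[:HAS_TYPE]->(content_type)<-[:HAS_TYPE]-(other_content)
       RETURN content_type, count(*) AS uses
       ORDER BY uses DESC
       LIMIT 10`,
      { ids }
    );
    return result.records.map((r) => ({
      contentType: r.get("content_type"),
      uses: r.get("uses").toNumber(),
    }));
  } finally {
    await session.close();
    await driver.close();
  }
}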