How should data be structured in DynamoDB? Is using a nested structure correct? - amazon-dynamodb

The question is the same as in the title.
The AWS documentation recommends keeping related data in one table. But in the various comments I read from other users, the advice was mostly not to use nested objects (list of maps). So what is it really supposed to look like?

When we use maps/lists, we are often modeling one-to-many relationships. For example, here's an example where I model the relationship between Users and their Hobbies.
This strategy works fine to model one-to-many relationships if you don't have any access patterns around searching users by hobby. For example, if your access pattern includes "fetch all users who like photography", this is not a good way to store your data. Instead, you may consider storing hobbies like this
This data model allows you to fetch users by hobby, something you couldn't do in the prior example.
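A minimal sketch of what the two layouts could look like, using plain Python dicts in place of real DynamoDB items. The attribute names (`PK`, `SK`, `hobbies`), the `USER#`/`HOBBY#` prefixes, and the helper functions are assumptions made for illustration, not something taken from the answer or from the AWS SDK. The dict built by `build_hobby_index` only stands in for a secondary index keyed on the hobby.

```python
# Layout 1 (assumed): hobbies nested inside the user item as a list attribute.
nested_items = [
    {"PK": "USER#alice", "name": "Alice", "hobbies": ["photography", "chess"]},
    {"PK": "USER#bob", "name": "Bob", "hobbies": ["chess"]},
]

# Layout 2 (assumed): one item per (user, hobby) pair.
flat_items = [
    {"PK": "USER#alice", "SK": "HOBBY#photography"},
    {"PK": "USER#alice", "SK": "HOBBY#chess"},
    {"PK": "USER#bob", "SK": "HOBBY#chess"},
]


def users_by_hobby_scan(items, hobby):
    """Nested layout: every user item has to be inspected (a full scan)."""
    return sorted(item["PK"] for item in items if hobby in item["hobbies"])


def build_hobby_index(items):
    """Stand-in for a secondary index that is keyed on SK instead of PK."""
    index = {}
    for item in items:
        index.setdefault(item["SK"], []).append(item["PK"])
    return index


def users_by_hobby_index(index, hobby):
    """Flat layout: a single lookup by hobby key, no scan over all users."""
    return sorted(index.get(f"HOBBY#{hobby}", []))


hobby_index = build_hobby_index(flat_items)
print(users_by_hobby_scan(nested_items, "photography"))
print(users_by_hobby_index(hobby_index, "chess"))
```

Both functions answer the same question, but only the second one does so without reading every user, which is the access pattern the nested layout cannot serve well.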
There is no single way to model data in DynamoDB. Instead, there are common strategies and patterns that model various relationships (one-to-many, many-to-many, etc). The best strategy for any particular situation will depend on the needs of your application.


Save user info in one or multiple documents

I am currently overthinking something that should be trivial.
I need to save user info that is composed of different sections like settings, weather, profile.
Should I be saving each in a different document or everything under one document?
To me it seems kind of ambiguous (since these docs won't be queried), and I am scared I am missing or not considering something.
There is no singular right way to model data in Firestore, nor in most other NoSQL databases. It all depends on the use-cases of your app, and you likely don't know all of those yet. So I find it best to just pick an approach, and see how far it gets you.
If you want to train yourself to get better at this, I recommend reading NoSQL data modeling and watching Getting to know Cloud Firestore, which cover many scenarios of data modeling in Firestore (always in the context of a specific set of use-cases).

Using LMDB to implement a sqlite-alike relational database, relevant resources?

For educational reasons I wish to build a functional, full relational database. I'm aware LMDB has been used as a storage backend for SQLite, but I don't know C. I'm on .NET and I'm not interested in just duplicating a "traditional" RDBMS (so, for example, I'm not worried about implementing a SQL parser, since I'm building my own custom scripting language); I want to expose the full relational model.
Consider this question similar to "How do I implement a programming language on top of LLVM" before worrying about why I'm not using SQLite or similar.
From the material I have read, LMDB looks great, especially because it provides transactions and reliability, plus the low-level plumbing. How that translates to changes that could touch several rows in several tables is another question.
Is there material that explains how a relational layer is implemented on top of something like LMDB? Is using LMDB (or its competitors) good enough, or is there a better way to get results?
Is it possible to use LMDB to store other structures like hashtables, arrays and (the one I'm most interested in, for a columnar database) bitmap arrays, i.e. similar to Redis?
P.S.: Is there a forum or another place to talk more about this subject?
I had this idea too. You should realize that this is tons of work and most likely no one will care. I haven't built a full-blown relational DB, as that is crazy for one person to do. You could check it out here
Anyway, I've used leveldb (and later rocksdb). With them you get key-value pairs sorted by key, the ability to get a value by key, iterate over keys, write many values atomically (WriteBatch), and read a consistent view of the data at a given point in time (snapshots). These features are enough to build correct thread-safe reading of table rows (using snapshots), correct all-or-nothing writing of data and related indexes (using WriteBatch), and even transactions.
Each column has its on-disk index (keys sorted by values), so you can efficiently do various operations on it, and the keys are stored with the values themselves, so you can efficiently read the values for a given id.
This setup is efficient for writing and reading using the available operations on tables with little data (say, fewer than a million rows). However, if a table grows, iterating over many keys can become slow. To solve this and to add a group-by statement I decided to add in-memory indexes, but that's another story. So all in all it might be a fun idea, but in reality it is a lot of work with often frustrating results. Why would you want to do that?
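The row-plus-index layout described above can be sketched roughly as follows. `TinyKV` is a hypothetical in-memory stand-in written for this example; it is not the leveldb or rocksdb API, and the key format (`table/row/...`, `table/idx/...`) is an assumption. It only imitates the three features the answer relies on: sorted keys, atomic batches, and snapshots.

```python
class TinyKV:
    """Hypothetical in-memory stand-in for an ordered key-value store."""

    def __init__(self):
        self._data = {}

    def write_batch(self, puts, deletes=()):
        # All-or-nothing: build the new state on a copy, then swap it in.
        new_data = dict(self._data)
        for key in deletes:
            new_data.pop(key, None)
        new_data.update(puts)
        self._data = new_data

    def snapshot(self):
        # A consistent, read-only view of the data at this moment.
        return dict(self._data)


def scan_prefix(view, prefix):
    """Iterate keys in sorted order, keeping those that start with prefix."""
    return [(key, view[key]) for key in sorted(view) if key.startswith(prefix)]


def insert_row(db, table, row_id, row):
    """Write the row values and their index entries in one atomic batch."""
    puts = {}
    for column, value in row.items():
        puts[f"{table}/row/{row_id}/{column}"] = value        # values by id
        puts[f"{table}/idx/{column}/{value}/{row_id}"] = ""   # ids by value
    db.write_batch(puts)


def read_row(view, table, row_id):
    prefix = f"{table}/row/{row_id}/"
    return {key[len(prefix):]: value for key, value in scan_prefix(view, prefix)}


def find_ids(view, table, column, value):
    prefix = f"{table}/idx/{column}/{value}/"
    return [key[len(prefix):] for key, _ in scan_prefix(view, prefix)]


db = TinyKV()
insert_row(db, "users", "1", {"name": "ann", "city": "oslo"})
before = db.snapshot()
insert_row(db, "users", "2", {"name": "bob", "city": "oslo"})
print(find_ids(before, "users", "city", "oslo"))         # view taken earlier
print(find_ids(db.snapshot(), "users", "city", "oslo"))  # current view
```

A real store would seek to the prefix instead of sorting all keys on every scan; the sketch trades that away for brevity.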

How to deal with complex database using an ORM on Android?

I can't find out how to deal properly with complex databases using an ORM on Android. I tried to find an open source project to see how it works, but I can't find one that suits what I'm looking for...
I learned about relational databases some years ago and worked on SQL Server and Oracle databases, huge ones. The first thing I learned when designing a database is to avoid storing the same data several times. The second thing I learned is to never do in code what you can do with SQL. So I'm facing several problems with Android and ORMs, since it looks like you absolutely have to use an ORM on Android to be a good developer...
Let's take an example and say we have 100 buildings with 50 people in each of them, and every building has a different address. I want to get all people with their building address. I can't put this in one table, or else the same strings will exist many times in the database. Since there are 50 people at each address, with only one table I would have the same address string 50 times for each building, so I create another table with only buildings and make a relationship between these two tables. This is a trivial case, but I have often seen Android apps store the same data many times in one or two tables... What is the point of using a relational database if you replicate data?
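A small sketch of the two-table design described above, using Python's built-in `sqlite3` module with an in-memory database. The table names, column names, and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE building (
        id INTEGER PRIMARY KEY,
        address TEXT NOT NULL
    );
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        building_id INTEGER NOT NULL REFERENCES building(id)
    );
    """
)

# Each address is stored once, however many people live there.
conn.execute("INSERT INTO building (id, address) VALUES (1, '1 Main St')")
conn.executemany(
    "INSERT INTO person (name, building_id) VALUES (?, ?)",
    [("Ann", 1), ("Bob", 1)],
)

# The join is done by the database, not in application code.
people_with_address = conn.execute(
    """
    SELECT person.name, building.address
    FROM person
    JOIN building ON building.id = person.building_id
    ORDER BY person.name
    """
).fetchall()
address_count = conn.execute("SELECT COUNT(*) FROM building").fetchone()[0]
print(people_with_address, address_count)
```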
Again, this is a simplistic example, but when you have 20 or 30 tables with complex relationships, an ORM-style query can quickly become unreadable compared to SQL. Moreover, not all SQL join types are generally supported by ORMs. Then you fall back on raw SQL queries, but what is the point of using an ORM if you can't use the object mapping, since you're not returning a table you can map to a class but the result of a query? Or maybe there is something I didn't understand. What is the point of using an ORM if you don't get the benefit of relational object mapping, or if it makes the queries hard to maintain?
I have also seen a lot of code where the ORM is used to get data from several tables and then the filtering and joining is done in code... What is the point of using a relational database if you have to do this in code? Some years ago, doing this was seen as the worst thing to do... but now I have seen it so many times on Android...
So another solution is to create a view in the DB and map my object to this view. That way I can use the power of SQL and the power of the ORM's relational object mapping. But several ORMs don't support views, like GreenDao, which is one of the most used ORMs today as far as I know...
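The view idea can be sketched like this, again with Python's `sqlite3` module rather than an Android ORM. The view name `person_with_address`, the `PersonWithAddress` class, and the sample data are hypothetical; the hand-written `load_people` function stands in for whatever mapping an ORM with view support would generate.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class PersonWithAddress:
    """Plain object mapped to one row of the view (hypothetical)."""
    name: str
    address: str


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE building (id INTEGER PRIMARY KEY, address TEXT NOT NULL);
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        building_id INTEGER NOT NULL REFERENCES building(id)
    );
    INSERT INTO building (id, address) VALUES (1, '1 Main St');
    INSERT INTO building (id, address) VALUES (2, '2 Oak Ave');
    INSERT INTO person (id, name, building_id) VALUES (1, 'Ann', 1);
    INSERT INTO person (id, name, building_id) VALUES (2, 'Bob', 2);

    -- The join lives in the database; callers see a flat, table-like shape.
    CREATE VIEW person_with_address AS
        SELECT person.name AS name, building.address AS address
        FROM person
        JOIN building ON building.id = person.building_id;
    """
)


def load_people(connection):
    """Map each row of the view to an object, as an ORM would for a table."""
    cursor = connection.execute(
        "SELECT name, address FROM person_with_address ORDER BY name"
    )
    return [PersonWithAddress(*row) for row in cursor]


people = load_people(conn)
print(people)
```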
All the examples I can find here and there either don't deal with complex databases or contain this kind of bad practice. Or at least it was considered bad practice for years... Has that changed?
So what's the best way to deal with "complex" databases on Android?

NoSQL: new kind of relationships using Arrays?

I had to manage relationships between documents over a NoSQL engine (Couchbase), and I figured out this way to solve my problem. Here is my solution and the steps that let me realize it:
https://forums.couchbase.com/t/document-relationships-using-arrays-and-views-passing-though-graph-theory/3281
My questions are:
What do you think about this solution?
Have you ever used something like this? How is it working?
Are there any better ideas? Criticism of this solution would be helpful.
Thank you.
Interesting post, Matteo. After reading it I realized that you could possibly improve on a few aspects:
Consider 1-1 node relationships. In your post you focus on N-N node relationships (sure, one can argue that 1-1 is a subset of N-N). However, I think there is potential for a different (optimized) implementation for 1-1 relationships. For 1-1 I use the node key value as a field in my JSON doc (e.g. user: {name:string, dob:date, addressID:string})
Node key design to address relationships: You can encode relationship information in the key value, e.g. key: "user#11", "user#11#address#123", "address#123#user#11", etc.
Data integrity aspects: Take into consideration the lack of complex transactions, i.e. you can't mutate several documents in one transaction. The design should compensate for that.
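The key design from the second point could be sketched as below. The helper names (`make_key`, `parse_key`, `relationship_keys`) are hypothetical; only the `type#id` key format comes from the examples above, and the sketch assumes ids never contain `#`.

```python
def make_key(*parts):
    """Join (type, id) pairs into a key such as 'user#11#address#123'."""
    return "#".join(f"{kind}#{ident}" for kind, ident in parts)


def parse_key(key):
    """Split a key back into its (type, id) pairs."""
    tokens = key.split("#")
    return list(zip(tokens[0::2], tokens[1::2]))


def relationship_keys(left, right):
    """Keys for both directions, so either side can be looked up by prefix."""
    return [make_key(left, right), make_key(right, left)]


print(make_key(("user", "11")))
print(relationship_keys(("user", "11"), ("address", "123")))
print(parse_key("user#11#address#123"))
```

Because there are no multi-document transactions, the two directional keys would be written separately, so the application has to tolerate or repair the case where only one of them exists.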
I have used a similar solution in my model design for Couchbase in the past. It has been in production for several years now and it is performing just fine (load is about 250 tps)... I tried to avoid creating complex node relations as much as possible and ended up having very few 1-1 and 1-N types.
I tested this solution and it works well. I like the flexibility of the 'always possible' N-N relationships, because you can simply add the relationship document when you need it without changing the application logic. There is a drawback: you need to implement your own constraints in the application logic to avoid abuse of relationships.
I noticed that using arrays offers no great advantage over JSON objects, and sometimes it may be useful to store other data about the relationship, for example its weight (or cost). So I suggest using a relationship document that has its own type:
{
"type": "relationship",
"documents": ["key1", "key2"],
"all-the-data-you-need": { ... }
}
Looking at performance, there isn't much difference between using objects and using arrays.
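A rough sketch of how such relationship documents might be built and used, with plain Python dicts standing in for Couchbase documents. The `data` field name, the `weight` value, and both helper functions are assumptions for illustration; a real deployment would resolve relationships through a view or query rather than a loop over all documents.

```python
def make_relationship(key_a, key_b, **data):
    """Build a relationship document linking two document keys."""
    return {
        "type": "relationship",
        "documents": [key_a, key_b],
        "data": dict(data),  # extra data about the link, e.g. a weight
    }


def related_keys(documents, key):
    """Return the keys linked to `key` by any relationship document."""
    linked = []
    for doc in documents:
        if doc.get("type") == "relationship" and key in doc["documents"]:
            linked.extend(other for other in doc["documents"] if other != key)
    return linked


relationships = [
    make_relationship("key1", "key2", weight=3),
    make_relationship("key2", "key3"),
]
print(related_keys(relationships, "key2"))
```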
Hope this helps someone! ;)

Passing dataset to different layers(design related)

I read in an article that it's not good practice to pass a DataSet between the different layers of a .NET web application (DAL -> BAL -> Pages and vice versa). Is that correct?
Please give your suggestions.
Thanks
SNA
On the one hand, the problem with datasets and datatables is that they expose database implementation details like column names and types outside of your data access layer. Change a column name in your database or query and odds are that the change is propagated to your dataset as well, forcing a re-compile of any tier that uses the dataset. So if you retrieve data into a dataset, you should convert it to strongly-typed business objects before passing it on.
On the other hand, a dataset doesn't care what kind of database it belongs to. You can use them with Access, Oracle, SQL Server, MySQL, anything. So there is some generic-ness there that can make them useful when passing data between tiers. And just as the business layer shouldn't care about database details, the data layer shouldn't really need to know what the business objects are, so there's a good argument that you should use them for data interchange at that level.
My normal procedure is to have a sort of one-way "translation" tier between the business and data access layers, so that the business layer only deals with business objects and the data layer only returns generic data. This currently takes one of two forms:
I'll write my data access methods to return datatables or datareaders, then the translation tier will use a factory pattern to convert those rows into the desired strongly-typed business objects.
or
I'll use C# iterator blocks to convert a datareader into an IEnumerable<IDataRecord> in the data access layer, and the translation tier will use them to change that IEnumerable<IDataRecord> into an IEnumerable<MyBusinessObject>, such that the code only ever iterates over the result set one time.
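The second form can be sketched in Python, with generators standing in for C# iterator blocks and dict-like records standing in for `IDataRecord`. The `Customer` class, the column names `cust_id` and `cust_name`, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator, Mapping


@dataclass
class Customer:
    """Hypothetical strongly-typed business object."""
    customer_id: int
    name: str


def read_records(rows: Iterable[Mapping[str, Any]]) -> Iterator[Mapping[str, Any]]:
    """Data access layer: yield generic records lazily, one at a time."""
    for row in rows:
        yield row


def to_customers(records: Iterable[Mapping[str, Any]]) -> Iterator[Customer]:
    """Translation tier: the only place that knows the column names."""
    for record in records:
        yield Customer(customer_id=record["cust_id"], name=record["cust_name"])


raw_rows = [
    {"cust_id": 1, "cust_name": "Ann"},
    {"cust_id": 2, "cust_name": "Bob"},
]
# The generators are chained, so the result set is walked only once.
customers = list(to_customers(read_records(raw_rows)))
print(customers)
```

If a column is renamed, only `to_customers` has to change; the business layer keeps working with `Customer`.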
There is nothing wrong with passing around datasets but it's not a great practice.
Pros:
Easy to pass around and use in .NET apps
No having to code wrapper classes
Lots of functionality built into DataSets
Cons:
Data type that is not really type safe.
Your data field names can change, and all parts of your app will compile fine until they blow up at runtime.
Heavy object. Dataset does a ton of stuff and you probably don't need 90% of it.
Having non-.NET apps talk to your DAL or BAL is not going to be very clean.
There's nothing wrong about passing DataSets from your DAL to your BAL.
I think this stackoverflow question on DAL best practices sums up the two schools of thought pretty well.
I am in the middle of a "discussion" with a colleague about the best way to implement the data layer in a new application.
One viewpoint is that the data layer should be aware of business objects (our own classes that represent an entity), and be able to work with that object natively.
The opposing viewpoint is that the data layer should be object-agnostic, and purely handle simple data types (strings, bools, dates, etc.)
There is no problem with passing a dataset across layers. If you look closely, you will notice that a dataset is passed by reference and not by value. So there is no performance issue here.
Now what you read is also right, but you have to understand the context. If you are passing the dataset across remote boundaries, that is not a recommended practice.
There's nothing fundamentally wrong with doing that. However, the basic idea of having a DAL, BLL and UI layer is that each layer can abstract what's beneath it. E.g. the BLL shouldn't have any knowledge of how the database is structured, because the DAL abstracts that away. If a dataset is being loaded in the DAL and then passed straight through the BLL to the pages, it kind of sounds like the BLL is pointless.
The strongest statement often seen about DataSet is not to pass it into or out of a web service. That goes beyond exposing implementation details, and includes exposing details of the platform (.NET).
Although it's possible to change "table" and "column" names in a DataSet from those in the underlying database, you're still largely stuck with the underlying structure of the database. To abstract that, I would use Entity Framework. It allows you, for instance, to define a "Customer" entity which takes data from multiple tables and puts it into a single entity. Code using the entity doesn't need to know whether it is implemented as one table, two tables, or whatever.
Even there, you should not pass these entities outside of a web service boundary. They still pass implementation details outside of the implementation. For instance, properties of the base classes get serialized, even though these are just implementation details.
As far as I've understood, the DataSet requires the db connection to be open, for as long as it is used, which will reduce performance in your application as it keeps the connection open until the content is rendered.
Instead, I recommend using generic collections, such as IEnumerable<myType> or IQueryable<myType>, where myType is a custom type which you fill with your data.
