CosmosDb has a good feature of Globally Distributed which gives Faster Response of data. This will be useful for Mobile Applications directly accessing CosmosDb where Users are spread across the Globe.
However I am using ASP.NET Web Application hosted in Azure. Here my Application to Database communication will be of Fixed Distance always.
Can I benefit from CosmosDb in this case?
This is for Azure hosted ASP.NET Application
You can utilize CosmosDB when you know noSQL concept and so is your code, it has different implementation for read and write processes or you are planning to do microservices or you have other projects that depends/communicate on your Webapp project and your using the same database
There are some points you need to take into account before choosing CosmosDB as the database.
Pricing model! CosmosDB is not a cheep database and pricing model is based on the provisioned throughput. Requests that exceed the provisioned throughput will be rejected by the database. So first make sure you completely understand how things work.
Like other document based databases, if you wanna keep a graph of objects in a document, you should consider how to handle concurrent updates to the documents (if that is the case in your app). Hope you know well the difference between document based and relational databases.
But regarding the benefits:
It has a great a integration support with other PaaS services in Azure
It scales very well if you have a good partitioning strategy
Related
As far as I can work out, CosmoDB has the ability to make Graph queries using the Gremlin query language. Apart from that the pricing, marketing etc. all seem the same. It seems strange that they came up with a new product to add Gremlin when they didn't do the same to add MongoDB support. What are the discernable differences between these two products?
The Azure Cosmos DB team member here.
Azure Cosmos DB started as “Project Florence” in 2010 to address developer pain-points faced by large scale applications inside Microsoft. Observing that the challenges of building globally distributed apps are not a problem unique to Microsoft, in 2015 we made the first generation of this technology available to Azure developers in the form of Azure DocumentDB. Since that time, we’ve added new features and introduced significant new capabilities. Azure Cosmos DB is the result. It is the next big leap in globally distributed, at scale, cloud databases. As a part of this release of Azure Cosmos DB, DocumentDB customers, with their data, are automatically Azure Cosmos DB customers. The transition is seamless and they now have access to the new breakthrough system and capabilities offered by Azure Cosmos DB.
In the evolution of Cosmos DB, we have added significant new capabilities since 2015 (when DocumentDB was made generally available) but only a subset of these capabilities was available in DocumentDB. These capabilities are in the areas of the core database engine as well as, global distribution, elastic scalability and industry-leading, comprehensive SLAs. Specifically, we have evolved the Cosmos DB database engine to be able to efficiently map all popular data models, type systems and APIs to the underlying data model of Cosmos DB. The developer facing manifestation of this work currently will experience it via support for Gremlin and Table Storage APIs. And this is just the beginning… We will be adding other popular APIs and newer data models over time with more advances towards performance and storage at global scale.
We also have extended the foundation for global and elastic scalability of throughput and storage. One of the very first manifestations of it is the RU/m (https://learn.microsoft.com/en-us/azure/cosmos-db/request-units-per-minute) but we have more capabilities that we will be announcing in these areas. The new capabilities will help save cost for our customers for various workloads. We have made several foundational enhancements to the global distribution subsystem. One of the many developer facing manifestations of this work is the consistent prefix consistency model (making in total 5 well-defined consistency models). However, there are many more interesting capabilities we will release as they mature.
It is important to point out that we view Azure Cosmos DB as a constantly evolving database service. Typically, we first validate all new capabilities with the large scale applications inside Microsoft, subsequently expose them to key external customers, and finally, release them to the world.
It is also important to point out that DocumentDB’s SQL dialect has always been just one of the many APIs that the underlying Cosmos DB was capable of supporting. As a developer using a fully managed service like Cosmos DB, the only interface to the service is the APIs exposed by the service. To that end, nothing really changes for a DocumentDB customer. Cosmos DB offers the exactly the same SQL API that DocumentDB did. However, now (and in the future) you can get access to other capabilities which were previously not accessible.
DocumentDB is one of the APIs for CosmosDB. Others include Table Storage, MongoDB, Gremlin.
Think about CosmosDB as the database platform that handles scaling, throughput, consitency, etc and DocumentDB as one of the types of the databases than run on CosmosDB.
Azure Cosmos DB natively supports multiple data models including documents, key-value, graph, and column-family. The core content-model of Cosmos DB’s database engine is based on atom-record-sequence (ARS). Atoms consist of a small set of primitive types like string, bool, and number. Records are structs composed of these types. Sequences are arrays consisting of atoms, records, or sequences.
The database engine can efficiently translate and project different data models onto the ARS-based data model. The core data model of Cosmos DB is natively accessible from dynamically typed programming languages and can be exposed as-is as JSON.
https://learn.microsoft.com/en-us/azure/cosmos-db/introduction
CosmosDB is the new DocumentDB for NoSQL solution.
As Cosmosdb architect Rimma mentioned
The Azure Cosmos DB DocumentDB API or SQL (DocumentDB) API is now
known as Azure Cosmos DB SQL API. You don't need to change anything to
continue running your apps built with DocumentDB/DocumentDB API. The
functionality remains the same. Thanks.
DocumentDB is one of the APIs for CosmosDB.As of now, if you go to Azure portal and try to create an Azure Cosmos DB, you have to select one of the 4 APIs available there:
Gremlin (Graph)
MongoDB
SQL (DocumentDB)
Table (key-value)
I am starting to port one old desktop single tenant application into the cloud and wish to hear what would be your recommendation about the databases for my cloud-based multi-tenant application?
My basic requirement is simple:
For each tenant, its data is separate to any other tenants' data. I can easily backup, restore, export the data for one single tenant without affecting other tenants.
I don't really want to care about multi-tenancy in the business logic code. It should look like a single tenant application behind the security layer, no tenant ID pass around etc.
Easy to query using some mature technology like LINQ.
Availability and scalability, of course, easy to set up replicas, fail-over and scaling up and down etc.
I have gone through some investigations about multi-tenant application development. I have noticed SQL databases from Azure and AWS are both very expensive(the cost for just SQL database instance is close to the license fee of the original application), so I definitely can't use separate SQL database instances for tenants.
Now I'm reading this book Developing Multi-tenant Applications for the Cloud, 3rd Edition, and it uses Azure Storage Service to implement multi-tenancy. I haven't finished the book yet, it seems you still have to handle the multi-tenancy by yourself and the sample code is already out of date.
I have seen lots of SO questions compare Azure Table Storage with MongoDB. The MongoDB is very new to me, not sure whether it could be easily used to fulfill my requirements?
And I have seen RavenDB as well, it does support multi-tenancy out of box. But I didn't see some good sample code about how to use it in Azure app development.
Hope to hear some good advices from awesome SO guys.
I would better opt with RavenDB on top of MongoDB. Even Raven is a new comer in to the game, it supports most of the features which traditional SQL supports.
Also to make up a decisions the volume of data you are dealing is a also a key decision pointer. Also the amount of traffic you are expecting.
Also keep in mind that operational costs and development efforts. HA and DR scenarios can be problematic when you use Raven or Mongo because of the fact that you need to host them. But when it comes to Azure Storage, it by defaults protects you to a maximum extent by maintaining 3 copies of information.
So I would suggest you to carefully make the trade offs and opt wisely based on your business needs, cost optimization, development and operational effort.
Having a single instance of your application for each tenant is a very expensive way to implement an application, however I realise that if an application was developed with a single tenant in mind, then the costs of changing over can be high.
First can we start out with why you have a desktop application connecting to a database at another location. The latency can really slow down an application. Ideally you would want a locally installed database and have it sync with the cloud DB, or add in appropriate caching into your application.
However the DB would still need to differentiate the clients.
Why do you need this to go to a cloud database? Is it for backup purposes, not installing a DB locally on a clients machine, accessing the same data from many machines or something else?
Unless your application is extremely large, I would recommend rewriting it for multi-tenant to one SQL Azure database. The architecture chosen at the beginning of the project doesn't suit your requirements now. As you expand you will run into further issues.
First of all let me clear that I am not from a web background so if any of my understanding about how it works is not correct please feel free to correct me
Let's say I have a website which I would like to host on cloud because
- I don't want to take care of hardware
- I want to scale my website as needed
Now I am a bit confused between role of SQL Server vs role of SQL Azure in this case.
Normal Web Hosting
When I think of a normal website I know that I need a host/server on which my website will be hosted. The host should be able to support SQL Server. For scaling purpose I will have to host my website/ASP Pages on multiple servers. Similarly if I want to scale up my SQL Server I will have to host it on multiple servers and will have to make sure data is up to date in all servers through some mechanism.
Cloud Based Hosting
Now I think I can setup similar structure on Cloud/Azure as well. If yes, would I be using true capabilities of Cloud in this case?
Or should I use SQL Azure instead of SQL Server? What benefit would I get in that case? Would I be still be responsible for for scaling up and consistency of data? I know I can scale up the website by setting the number of VMs/instance but what about scaling of database?
Edit
Thanks to Florin Dumitrescu the terminology I wanted to use was Scaling Out because I am more concerned about the performance rather than how big my database is in terms of size. I am more concerned about how database would scale between different servers/systems to accommodate the load and hence result in better performance
SQL Azure, as Yossi mentioned, is a Database-as-a-Service. As such, you simply ask for it to be provisioned, magic happens, and you have a database that scales from 1GB to 5GB, 10GB, all the way to 50GB (soon to be 150GB as announced at SQL PASS). The nice thing about SQL Azure: you don't have to worry about any infrastructure, servers, licensing, etc. You simply connect with your connection string. SQL Azure is designed to be scalable to handle a considerable number of concurrent tenants, so you don't have to concern yourself with scaling.
SQL Azure also replicates its data in the data center, to provide "durable" storage. You still need to design a Disaster Recovery scheme, in case the data center becomes unavailable (and you can use the Data Sync service for that).
As far as your website itself: As you scale out to multiple instances, each instance runs the same code and uses the same resources. Taking this one step further, you can move your static (non-changing) web content, such as images and CSS, to Blob storage. This has several advantages over storing them with the website itself:
Ability to enable the Content Delivery Network, a worldwide edge-caching service providing better performance for your end users
Less strain on your web server instances, as requests for those images will now be directed to Blob storage, a completely separate URL than your website
Ability to update an image or stylesheet without having to re-deploy your application - simply upload a new file to Blob storage.
I highly recommend the Windows Azure Platform Training Kit, as there are labs that take you through the fundamentals of all of this, with complete code samples as well. This is updated almost monthly, staying in sync with the latest Windows Azure SDK and tools.
If you're hosting your web site in the cloud, and you need a database, than SQL Azure is almost certainly the best option.
SQL Azure is a database as a service, so you'll create your database and work against it from your code, but not have to worry about the provisioninig, there are no servers as such, it is all being taken care of.
From an application point of view it looks and behaves pretty much like SQL Server, so initially all that changes is the connecting string
As other noted SQL Azure takes away your concerns about setting up and taking care of the infrastructure. This is part of the premise of Azure in general which is to provide a platform rather than just Infrastructure.
The price you pay for that are some limitations on the capabilities (vs. regular SQL). Limitation on the size (at least until Federation will be available) and increased latency (since your database is not running on the same server of your app)
Microsoft Teched has as "SQL Azure Performance and Elasticity Guide" which you should probably take a look at
I am planning to develop a fairly small SaaS service. Every business client will have an associated database (same schema among clients' databases, different data). In addition, they will have a unique domain pointing to the web app, and here I see these 2 options:
The domains will point to a unique web app, which will change the
connection string to the proper client's database depending on the
domain. (That is, I will need to deploy one web app only.)
The domains will point to their own web app, which is really the
same web app replicated for every client but with the proper
connection string to the client's database. (That is, I will need to
deploy many web apps.)
This is for an ASP.NET 4.0 MVC 3.0 web app that will run on IIS 7.0. It will be fairly small, but I do require to be scalable. Should I go with 1 or 2?
This MSDN article is a great resource that goes into detail about the advantages of three patterns:
Separated DB. Each app instance has its own DB instance. Easier, but can be difficult to administer from a database infrastructure standpoint.
Separated schema. Each app instance shares a DB but is partitioned via schemas. Requires a little more coding work, and mitigates some of the challenges of a totally separate schema, but still has difficulties if you need individual site backup/restore and things like that.
Shared schema. Your app is responsible for partitioning the data based on the app instance. This requires the most work, but is most flexible in terms of management of the data.
In terms of how your app handles it, the DB design will probably determine that. I have in the past done both shared DB and shared schema. In the separated DB approach, I usually separate the app instances as well. In the shared schema approach, it's the same app with logic to modify what data is available based on login and/or hostname.
I'm not sure this is the answer you're looking for, but there is a third option:
Using a multi-tenant database design. A single database which supports all clients. Your tables would contain composite primary keys.
Scale out when you need. If your service is small, I wouldn't see any benefit to multiple databases except for assured data security - meaning, you'll only bring back query results for the correct client. The costs will be much higher running multiple databases if you're planning on hosting with a cloud service.
If SalesForce can host their SaaS using a multitenant design, I would at least consider this as a viable option for a small service.
After reading this question and the suggested link explaining when is more appropriate to use SQLite vs another DB it's still unclear to me one simple thing, and I hope someone could clarify it.
They say:
Situations Where SQLite Works Well
Websites
SQLite usually will work
great as the database engine for low
to medium traffic websites...
...
Situations Where Another RDBMS May
Work Better
Client/Server
Applications...
If you have many
client programs accessing a common
database over a network...
Isn't a website also a client/server app?
I mean I don't understand, a website is exactly a situation where I have many client programs (users with their web browsres) concurrently accessing a common DB via one server application.
Just to keep it simple: at the end of the day, is it possible for instance to use this SQLite for an ecommerce site or an online catalog or a CMS site with about 1000 products/pages?
The users' web browsers don't directly access the database; the web application does. And normally the request/response cycle for each page the user views will be very fast, usually lasting a fraction of a second.
IIRC, a transaction in SQLite locks the whole database file, meaning that if a web app request requires a blocking transaction, all traffic will effectively be serialized. This is fine, for a low-to-medium traffic website, because many requests per second can still be handled.
In a client-server database application, however, multiple users may need to keep connections open for longer periods of time, and may also need to perform transactions. This is far less of a problem for bigger RDBMS systems because locking can be performed in a more fine-grained way.
SQLite can allow multiple client reads but only single client write. See: https://www.sqlite.org/faq.html
Client/server is when multiple clients do simultaneous writes to the database, such as order entry where there are multiple users simultanously inserting and updating information, or a multi-user blog where there are multiple simultaneous editors.
A website, in the case of read-only, is not client/server but rather simply a server with multiple requests. In many cases, a website is heavily cached and the database is not even accessed, or rarely.
In the case of a slightly used ecommerce website, say a few simultaneous shoppers, this could be supported by SQLite, or by MySQL. Somewhere there is a line where performance is better for a highly-concurrent database as opposed to SQLite.
Note that the number of products/pages is not a great way to determine the requirement for MySQL over SQLite, rather it is the number of concurrent users, and at what point their concurrent behavior experiences slowness due to waiting for locks to clear.
A website isn't necessarily a client server application in the context of use.
I think when they say website, they mean that the web application will directly manage the database. That is, the database file will live within the web site and will not be access via any other means. (A single point of access, put simply)
In contrast, a client/server app may have the web site accessing the data store as well as another web site, SOAP client or even a smart client. IN this context, you have multiple clients access one database (server). This is where the web site would become (yet another) client.
Another aspect to consider when constrasting the two, is what is the percentage of writes compared to reads. I think SQLite will perform happiply when there is little writing going on compared to the amount of reads. SQLite, I understand, doesn't do well in a multiple write scenario. It's intended for a single (handful?) process to be manipulating it.
I mainly only use SQLite on embedded applications. (iOS, Android). For larger, more complex websites (like your describing) I would use something like mySQL.