Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
Improve this question
Is this Cosmos DB (SQL API) query:
SELECT * FROM c WHERE c.Name = 'John'
Faster or cheaper than
SELECT * FROM c WHERE c.Personal.Name = 'John'
I'm trying to understand the consequences of designing my data flat vs. nested (not normalized vs. de-normalized).
Thanks
The two versions of the query you mention are probably very close in cost, but in my experience the more important impact of model complexity is on the cost of writes. Cosmos creates an index by default for every possible path from the item root, so the more complex your model, the more paths get indexed, which directly increases the cost of a write operation. As the indexing docs note:
By optimizing the number of paths that are indexed, you can
substantially reduce the latency and RU charge of write operations.
So if you embed a Personal item within your root item with multiple properties, you make your item more expensive to write.
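If write RU charges become a concern, the default index-everything behavior can be narrowed with a custom indexing policy. Below is a minimal sketch (shown as a Python dict, as you would pass it when creating a container with the azure-cosmos SDK); the `/Name` and `/Personal/Name` paths come from the question, and the policy shape follows the standard Cosmos DB indexing policy format:

```python
# A minimal Cosmos DB indexing policy that indexes only the paths the
# queries above filter on, excluding everything else to cut write RU cost.
# (Sketch only; path names are taken from the question.)
indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [
        {"path": "/Name/?"},           # flat model: c.Name
        {"path": "/Personal/Name/?"},  # nested model: c.Personal.Name
    ],
    "excludedPaths": [
        {"path": "/*"},  # everything not listed above is left unindexed
    ],
}
```

The trade-off is that any future query filtering on an unindexed path turns into an expensive scan, so only exclude paths you are confident you will not query.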
There are also quite a few questions on Stack Overflow from people asking how to write queries against their complicated object models, and they never like the comment "why not a more straightforward model?" If you have the chance, avoid that fate. :)
In general, keeping items as simple and small as possible seems like the rule of thumb to follow. As always, test and see. The RU cost of a query is deterministic, so you can measure the impact of a change directly just by tweaking variables and running a quick test.
In Firestore, sub-collections are supposed to be used to save costs. In my case I intend to have one document per registered user and to nest the other data I need under it. That way I would have only one read per user who signs in, with the remaining information held in sub-collections, per the Firestore documentation.
The future problem is the reports. If there are 100 users and a complete report over all of them is required, that is 100 reads; with 50,000 users it is 50,000 reads. Additionally, although I do not know the topic of snapshots well, each of these will generate an additional cost on updates.
I would like if someone can support me with suggestions or help me clarify this:
Is it possible to have a main document that contains all the information, and have that be what is used both for reports and for users to get their data? That is, instead of having N documents, one per user, have a single "maindoc" document whose sub-collections hold all the user data.
Note: as for complementary reporting solutions such as exporting data to BigQuery or the API service, I do not consider them relevant, since they also incur N reads according to the number of documents.
I will have only one read per user who signs in and the other information will be included in the sub-collections
Not really: Firestore queries are shallow by nature, which means getting a document does not return the contents of its sub-collections. Sub-collections are there to make data easier to understand, not to save cost. Maybe check this question out for more information.
The future problem is the reports
You get billed $0.06 for every 100,000 document reads (that's the price for my region; yours may differ), so unless you run the reports function multiple times a day over millions of documents, I think it's OK.
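To put numbers on that, here is a quick back-of-the-envelope calculation using the regional price quoted above (prices vary by region, so substitute your own):

```python
# Back-of-the-envelope Firestore read cost for a "one document per user"
# report, at $0.06 per 100,000 document reads (regional price; may differ).
PRICE_PER_100K_READS = 0.06

def report_cost(num_users: int, runs_per_day: int = 1, days: int = 30) -> float:
    """Cost of reading one document per user, per report run, over `days`."""
    reads = num_users * runs_per_day * days
    return reads / 100_000 * PRICE_PER_100K_READS

# 50,000 users, one full report per day for a month: 1.5M reads.
print(round(report_cost(50_000), 2))  # 0.9
```

So even at 50,000 users, a daily full report costs on the order of a dollar a month in reads; the math only gets painful at millions of documents or many runs per day.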
Is it possible to have a main document that contains all the information, and have that be what is used for reports and for users to get the data?
This is a really bad idea, because you get billed not only for document reads but also for network egress, i.e. the amount of network bandwidth you use. Doing things this way means every user has to download a giant document, which slows down the app and takes a lot of bandwidth.
Would it be a better option to look at SQL alternatives whose cost is based on data size rather than reads/writes?
This comes down to your use case. But for me, the difference in pricing is not that big compared with other BaaS options, and Firebase's documentation is very hard to beat.
Suppose we have a web service that aggregates 20,000 users, each linked to 300 unique user data entities containing whatever. Here's a naive approach to designing an example relational database able to store the above data:
Create table for users.
Create table for user data.
And thus, the user data table contains 6,000,000 rows.
Querying tables with millions of rows is slow, especially since we have to deal with hierarchical data and do some uncommon computations far removed from SELECT * FROM userdata. At any given point we only need a specific user's data, not the whole thing (getting it is fast), but we then have to do weird stuff with it, multiple times.
I'd like our web service to be fast, so I thought of the following approaches:
Optimize the hell out of the queries, do a lot of caching, etc. This is nice, but these are just temporary workarounds: when the database grows further, they will cease to work.
Rewrite our model layer to use NoSQL technology. This is not possible due to the lack of relational database features, and even when we tried this approach, early tests made some functionality even slower than it already was.
Implement some kind of scalability. (You hear about cloud computing a lot nowadays.) This is the most wanted option.
Implement some manual solution. For example, I could store all users whose names begin with the letters "A..M" on server 1, while all other users would belong to server 2. The problem with this approach is that I would have to redesign our architecture quite a lot, and I'd like to avoid that.
Ideally, I'd have some kind of transparent solution that would allow me to query seemingly uniform database server with no changes to code whatsoever. The database server would scatter its table data on many workers in a smart way (much like database optimizers), thus effectively speeding everything up. (Is this even possible?)
In both cases, achieving interoperability seems like a lot of trouble...
Switch from SQLite to a Postgres or Oracle solution. This isn't going to be cheap, so I'd like some kind of confirmation before doing it.
What are my options? I want all my SELECTs and JOINs over indexed data to be real-time, but the bigger userdata gets, the more expensive the queries become.
I don't think you should default to NoSQL just because you have that amount of data. What kind of problem are you expecting it to solve?
IMHO this depends on your queries. You haven't mentioned any kind of massive writing, so SQL is still appropriate so far.
It sounds like you want to perform queries using JOINs. These can be slow on very large data even with appropriate indexes. What you can do is lower your level of decomposition and duplicate data, so that everything sits in one database row and is fetched together from disk. If you are concerned about latency, avoiding joins is a good approach, but it does not rule out SQL, since you can duplicate data in SQL as well.
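The "lower your level of decomposition" idea can be sketched with SQLite (table and column names below are hypothetical, not from the question): the normalized schema needs a JOIN to get a user's name alongside their data, while the denormalized one copies the name into each data row so the hot query is a single indexed lookup.

```python
import sqlite3

# Sketch of "duplicate data to avoid a JOIN" in plain SQL (SQLite).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);

    -- Normalized: fetching a user's data plus their name needs a JOIN.
    CREATE TABLE userdata (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        payload TEXT
    );

    -- Denormalized: user_name is copied into each row, so the hot
    -- query is a single indexed lookup with no JOIN.
    CREATE TABLE userdata_denorm (
        id        INTEGER PRIMARY KEY,
        user_id   INTEGER,
        user_name TEXT,      -- duplicated from users.name
        payload   TEXT
    );
    CREATE INDEX idx_denorm_user ON userdata_denorm(user_id);
""")
db.execute("INSERT INTO users VALUES (1, 'alice')")
db.execute("INSERT INTO userdata VALUES (1, 1, 'entity-1')")
db.execute("INSERT INTO userdata_denorm VALUES (1, 1, 'alice', 'entity-1')")

# JOIN-free read path; the trade-off is keeping user_name in sync on writes.
row = db.execute(
    "SELECT user_name, payload FROM userdata_denorm WHERE user_id = ?", (1,)
).fetchone()
print(row)  # ('alice', 'entity-1')
```

The cost you accept is on the write path: every place that updates `users.name` must also update the duplicated copies, or they drift.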
Significant for your decision should be the structure of your queries. Do you want to SELECT only a few fields in your queries (SQL), or do you always want to fetch the whole document (e.g. Mongo and JSON)?
The second significant criterion is scalability: NoSQL often relaxes the usual SQL guarantees (settling for eventual consistency, for example), so it can provide better results by scaling out.
We are new to Scrum and part way through the first sprint we have realised that one of the team members (a developer) needs to do some investigation into how navigation should work (from a user perspective) in the application.
So at the end of this investigation we should have a proposal or prototype of how something should work, but it won't actually have been coded into the application.
So my question is: how should we deal with something like this in sprint planning? I don't really see it as a user story, but then what is it, and how is it treated in Scrum? Does something need to be added to the planning board for the investigation?
Thanks
Paul.
Try to treat prototyping like any other requirement as much as possible. Think about what you want to achieve, create a user story, define one or several tasks, and estimate them during sprint planning. Think of the development team as the user in this case. Definitely have it on the planning board and track progress in the daily Scrum meetings. If you have problems estimating the tasks, define them as "time-boxed", i.e. with a fixed time budget, to prevent "endless" work without results.
Although you already have a solution, I just wanted to add something here.
Such prototyping/research work is termed a spike in the Agile world.
The team dedicates some members to a spike only for as long as it takes to understand the feasibility of the user story and to put the whole team in a position to estimate it.
Scrum is an organizational process rather than a development model like prototype-driven development. That means different X-driven development models can easily be incorporated into it, such as TDD or even prototype-driven development (PDD).
To incorporate PDD into Scrum, you can set several milestones that are prototype versions. Scrum can then be used normally, treating each prototype as a whole new project. This works well for a complex prototype.
However, if creating a prototype is very easy and a single person can do it in one or two sprints' worth of time, it might be useful to retain a prototype specialist who, much like an application specialist, monitors the work of the rest of the team to check consistency with the ultimate goal. Unlike the application specialist, though, a prototype specialist can iteratively deliver new prototypes, guiding the work of the rest of the team in a practical manner.
I have an ASP.NET 4.0 application with an .mdf file in my App_Data folder where I store some data. There is a "User" table with 15 fields and an "Answers" table with about 30 fields. In most scenarios on my website, the user retrieves some data from the "User" table and writes some data to the "Answers" table.
I want to test the performance of my application when about 10,000 users use the system. What will happen if 10,000 users log in and use the system at the same time, and how will performance be affected? In general, what is the best practice for testing the performance of ASP.NET pages?
Any help will be appreciated.
Thanks in advance.
It reads like performance testing/engineering is not your core discipline. I would recommend hiring someone to either run this effort or assist you with it. Performance testing is a specialized development practice with specific requirement sets, tool expertise and analytical methods. It takes quite a while to become effective in the discipline even in the best case conditions.
In short, you begin with your load profile, then progress to definitions of the business processes within that load profile. You then select a tool that can exercise the interfaces appropriately. You will need a defined initial condition for your testing efforts, and specific, objective measures to determine system performance relative to your requirements. Here's a document that can provide some insight as a benchmark on the level of effort often required: http://www.tpc.org/tpcc/spec/tpcc_current.pdf
Something which disturbs me greatly is your use case of "at the same time," which is a practical impossibility for systems where the user agent is not synchronized to a clock tick. Users can be close, concurrent within a defined window, but true simultaneity is exceedingly rare.
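The shape of such a test (many virtual users, concurrent within a window rather than truly simultaneous, with latencies collected and summarized) can be sketched in a few lines. The handler below is a hypothetical stand-in for a real request to your ASP.NET pages; in practice you would use a dedicated load-testing tool rather than hand-rolled threads:

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

# Minimal load-test harness sketch: N virtual users run concurrently
# within a window, and per-request latencies are summarized afterwards.
def fake_login_and_answer(user_id: int) -> float:
    """Stand-in for one user's login + answer round-trip; returns latency."""
    start = time.perf_counter()
    time.sleep(0.001)  # placeholder for the real HTTP request
    return time.perf_counter() - start

def run_load(num_users: int, concurrency: int) -> dict:
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(fake_login_and_answer, range(num_users)))
    latencies.sort()
    return {
        "users": num_users,
        "median_s": statistics.median(latencies),
        "p95_s": latencies[int(len(latencies) * 0.95) - 1],
    }

result = run_load(num_users=200, concurrency=50)
print(result["users"])  # 200
```

Note the report uses percentiles rather than averages: under load, the tail (p95/p99) is usually what your requirements should constrain, not the mean.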
I have an interview coming up for an entry-level PL/SQL developer job. I took a class in PL/SQL but have not done any projects in it (I have in other languages). I do know basic SQL (joins, subqueries, etc.), so I am wondering what specific PL/SQL information I should know.
I'd agree with SP -- if it's entry level, I'd worry more about the person's general knowledge, how well their personality meshes with the organization, and their willingness and aptitude to learn more.
I was doing a telephone interview for a PL/SQL programmer (with no advance warning), and for one of the questions, the candidate said he didn't know, but he was fairly sure the answer was in a given book. I accepted that as a right answer -- for entry level, admitting that you don't know everything is pretty important.
If you're asking what you should look over before the interview: I'd say don't study too much, or you might stress out and make a bad impression. Normally, I'd look to see whether the person has good skills for the job at hand, but I don't know what they're hiring for ... so if you want to review something, a good understanding of cursors and SQL statement tuning goes a long way.
I'd say for an entry level position you should have a degree and have taken a course or two on DBs. No experience necessary.
When interviewing PL/SQL developers for an entry position I expect:
0. Ability to write simple SQL queries for simple examples (joins, aggregates)
1. Understanding of common concepts (triggers, indexes, sequences)
2. Understanding of Oracle schema/user concepts, grants and synonyms
3. Understanding of package, procedure and function concepts
4. Ability to write PL/SQL as a normal procedural language (assignments, loops, procedures, types, etc.)
If only 0 and 1 are satisfied, additional experience on another big SQL server (Sybase, Microsoft) is required. And time to learn after hiring, of course :)
P.S. In periods of active entry-level hiring (not now :( ) we require only a CS degree and the ability to learn.
Who is Tom Kyte?
From a technical point of view, I would expect the candidate to be able to create tables, perform simple selects, joins, inserts, updates and deletes.
While interviewing them I would ask questions about working with dates and strings, cursors, etc., asking progressively more detailed questions until they couldn't answer. At that point I'd ask what they would do to find an answer, with consulting the Oracle documentation, asking a team member, or web searches all being acceptable answers.
Good luck with the interview.
Entry level applicants must demonstrate the following skills:
Ability to find the interview site successfully.
Can breathe in and out without prompting.
Demonstrate adequate control of voluntary bodily functions.
Successfully spell own name. Use of items such as driver's license, credit cards, etc, as aids is acceptable.
OK, but seriously...
I'd expect an entry-level applicant to be able to demonstrate some basic familiarity with programming (iteration, loops, subroutines). Give them a logic test - see how they do. Have them show that they can write some very basic DML queries. Polite - no attitude. Ability to listen. Ability to talk coherently. Dress and deportment reasonable for an office setting. (This means you can have all the tattoos and body piercings you want, and can wear the most eclectic clothing you like - but I won't be hiring you).