total registered vs. concurrent users - web-deployment

Is there a proper way, equation or technique in general to say, "My web application needs to support N number of total users which via this equation/technique/rockHardExperience tells me that I need to support X number of concurrent page requests"?
From my research and/or gut feeling it seems like it would be something like:
totalLoadCapabilityRequired = (totalUsersN x .10 ) * .5
where .10 is for roughly 10% on at any given time
and the whole thing multiplied by 50% to suggest a 50% chance of those total users online executing a request at roughly the same time
any insights would help me in making sure I implement support in my application that is on par for the demand. I expect a lot of users but don't want to over anticipate too early. I know for starters that the org I am programming for will have 45,000 users that they want to use my system, with an anticipation on success for many more.

Here's a couple of things to think about:
What's the time span in which you expect the bulk of your visits? If it's an office application within the same physical company your capacity planning should be based on an 8 hour period. If most visits will come from the same continent you can plan for a 12 hour period instead, etc. Base your visitor spread on that.
Which pages do you anticipate will be the most popular and how heavy are those pages (i.e. how many pages can you load in one second)? Get an understanding of parts that would benefit from caching to squeeze out more performance.
Don't plan based on peak load; design your app to scale and start small.
Design your app in a way that you can take run snapshots at every 500th request; you can use tools like xhprof to create files that you can run through cachegrind tools to analyze the performance as it runs.
In short, there's no catch-all formula :) for a ballpark figure your formula will probably be good enough, but take the above points in consideration.

Related

Firebase App Check - How to fill in increased quota form?

I have a few questions regarding this, as I don't want to submit the application form without knowing how it works. I have a turn based online game with around 150k DAU. So that means there are alot of writes, reads and listeners. My (RTDB) load peaks at about 30% atm.
Default calls per day is 10,000. What happens when those run out, will the calls beyond that return errors?
Would it cost me alot more given the amount of users?
In regards to performance, is there something like a "known average percent increase" in read/write operations? It's described in the tutorial video as "instant" but everything takes a little time.
How do I know what amount to apply for? I'm guessing it's 1M-10M or 10M-100M but it's hard to know..
Thankful for any response :)

How to Sample Adobe Analytics (Omniture) Data

I can't find anything on the web about how to sample Adobe Analytics data? I need to integrate Adobe Analytics into a new website with a ton of traffic so the stakeholders want to sample the data to avoid exorbitant server calls. I'm using DTM but not sure if that will help or be a non-factor? Can anyone either point me to some documentation or give me some direction on how to do this?
Adobe Analytics does not have any built-in method for sampling data, neither on their end nor in the js code.
DTM doesn't offer anything like this either. It doesn't have any (exposed) mechanisms in place to evaluate all requests made to a given property (container); any rules that extend state beyond "hit" scope are cookie based.
Adobe Target does offer ability to output code based on % of traffic so you can achieve sampling this way, but really, you're just trading one server call cost for another.
Basically, your only solution would be to create your own server-side framework for conditionally outputting the Adobe Analytics (or DTM) tag, to achieve sampling with Adobe Analytics.
Update:
#MichaelJohns comment below:
We have a file that we use as a boot strap file to serve the DTM file.
What I think we are going to do is use some JS logic and cookies
around that to determine if a visitor should be served the DTM code.
Okay, well maybe i'm misunderstanding what your goal here is (but I don't think I am) but that's not going to work.
For example, if you only want to output tracking for 50% of visitors, how would you use javascript and cookies alone to achieve this? In order to know that you are only filtering 50%, you need to know the total # of people in play. By itself, javascript and cookies only know about ONE browser, ONE person. It has no way of knowing anything about those other 99 people unless you have some sort of shared state between all of them, like keeping track of a count in a database server-side.
The best you can do solely with javascript and cookies is that you can basically flip a coin. In this example of 50%, basically you'd pick a random # between 1 and 100 and lower half gets no tracking, higher half gets tracking.
The problem with this is that it is possible for the pendulum to swing 100% one way or the other. It is the same principle as flipping a coin 100 times in a row: it is entirely possible that it can land on tails all 100 times.
In theory, the trend over time should show an overall average of 50/50, but this has a major flaw in that you may go one month with a ton of traffic, another month with few. Or you could have a week with very little traffic followed by 1 day of a lot of traffic. And you really have no idea how that's going to manifest over time; you can't really know which way your pendulum is swinging unless you ARE actually recording 100% of the traffic to begin with. The affect of all this is that it will absolutely destroy your trended data, which is the core principle of making any kind of meaningful analysis.
So basically, if you really want to reliably output tracking to a % of traffic, you will need a mechanism in place that does in fact record 100% of traffic. If I were going to roll my own homebrewed "sampler", I would do this:
In either a flatfile or a database table I would have two columns, one representing "yes", one representing "no". And each time a request is made, I look for the cookie. If the cookie does NOT exist, I count this as a new visitor. Since it is a new visitor, I will increment one of those columns by 1.
Which one? It depends on what percent of traffic I am wanting to (not) track. In this example, we're doing a very simple 50/50 split, so really, all I need to do is increment whichever one is lower, and in the case that they are currently both equal, I can pick one at random. If you want to do a more uneven split, e.g. 30% tracked, 70% not tracked, then the formula becomes a bit more complex. But that's a different topic for discussion ( also, there are a lot of papers and documents and wikis out there published by people a lot smarter than me that can explain it a lot better than me! ).
Then, if it is fated that that I incremented the "yes" column, I set the "track" cookie to "yes". Otherwise I set the "track" cookie to "no".
Then in in my controller (or bootstrap, router, whatever all requests go through), I would look for the cookie called "track" and see if it has a value of "yes" or "no". If "yes" then I output the tracking script. If "no" then I do not.
So in summary, process would be:
Request is made
Look for cookie.
If cookie is not set, update database/flatfile incrementing either yes or no.
Set cookie with yes or no.
If cookie is set to yes, output tracking
If cookie is set to no, don't output tracking
Note: Depending on language/technology of your server, cookie won't actually be set until next request, so you may need to throw in logic to look for a returned value from db/flatfile update, then fallback to looking for cookie value in last 2 steps.
Another (more general) note: In general, you should beware sampling. It is true that some tracking tools (most notably Google Analytics) samples data. But the thing is, it initially records all of the data, and then uses complex algorithms to sample from there, including excluding/exempting certain key metrics from being sampled (like purchases, goals, etc.).
Just think about that for a minute. Even if you take the time to setup a proper "sampler" as described above, you are basically throwing out the window data proving people are doing key things on your site - the important things that help you decide where to go as far as giving visitors a better experience on your site, etc..so now the only way around it is to start recording everything internally and factoring those things in to whether or not to send the data to AA.
But all that aside.. Look, I will agree that hits are something to be concerned about on some level. I've worked with very, very large clients with effectively unlimited budgets, and even they worry about hit costs racking up.
But the bottom line is you are paying for an enterprise level tool. If you are concerned about the cost from Adobe Analytics as far as your site traffic.. maybe you should consider moving away from Adobe Analytics, and towards a different tool like GA, or some other tool that doesn't charge by the hit. Adobe Analytics is an enterprise level tool that offers a lot more than most other tools, and it is priced accordingly. No offense, but IMO that's like leasing a Mercedes and then cheaping out on the quality of gasoline you use.

Could "filling up" Google Analytics with millions of events slow down query performance / increase sampling?

Considering doing some relatively large scale event tracking on my website.
I estimate this would create up to 6 million new events per month in Google Analytics.
My questions are, would all of this extra data that I'm now hanging onto:
a) Slow down GA UI performance
and
b) Increase the amount of data sampling
Notes:
I have noticed that GA seems to be taking longer to retrieve results for longer timelines for my website lately, but I don't know if it has to do with the increased amount of event tracking I've been doing lately or not – it may be that GA is fighting for resources as it matures and as more and more people collect more and more data...
Finally, one might guess that adding events may only slow down reporting on events, but this isn't necessarily so is it?
Drewdavid,
The amount of data being loaded will influence the speed of GA performance, but nothing really dramatic I would say. I am running a website/app with 15+ million events per month and even though all the reporting is automated via API, every now and then we need to find something specific and use the regular GA UI.
More than speed I would be worried about sampling. That's the reason we automated the reporting in the first place as there are some ways how you can eliminate it (with some limitations. See this post for instance that describes using Analytics Canvas, one my of favorite tools (am not affiliated in any way :-).
Also, let me ask what would be the purpose of your events? Think twice if you would actually use them later on...
Slow down GA UI performance
Standard Reports are precompiled and will display as usual. Reports that are generated ad hoc (because you apply filters, segments etc.) will take a little longer, but not so much that it hurts.
Increase the amount of data sampling
If by "sampling" you mean throwing away raw data, Google does not do that (I actually have that in writing from a Google representative). However the reports might not be able to resolve all data points (e.g. you get Top 10 Keywords and everything else is lumped under "other").
However those events will count towards you data limit which is ten million interaction hits (pageviews, events, transactions, any single product in a transaction, user timings and possibly others). Google will not drop data or close your account without warning (again, I have that in writing from a Google Sales Manager) but they reserve to right to either force you to collect less interaction hits or to close your account some time after they issued a warning (actually they will ask you to upgrade to Premium first, but chances are you don't want to spend that much money).
Google is pretty lenient when it comes to violations of the data limit but other peoples leniency is not a good basis for a reliable service, so you want to make sure that you stay withing the limits.

How to update SQL Server database every 1 minute?

I have a SQL Server database which contains stock market quotes and other related data.
This database needs to be updated at regular interval, say 1 minute.
My question is:
How do I get stock quotes every 1 minute and update it to database?
I really appreciate your help.
Thanks!
You know, you seriously put the question from the wrong side. Like "I have a car, Mercedes, Coupe - how can I find the best road from A to B". Totally unrelated to the car.
Same with your question - this is not a sql or even an asp.net question to start with. The solution is independant of both, the sql server used and your web technology. Your main question is:
How do I get stock quotes every 12 minute and update it to the database?
Here we go. I assume you (a) talk of US stocks and (b) mean all of them, not a handfull.. 1 minute is too small an interval to make scanning things like yahoo.com feasible - main problem here is that there are tousands of stocks (actually more in the tens of thousands), and you dont want to go to yahoo scrapping thousands of pages per minute.
Same time, a end retail user data feed provider will not work. They support X symbols at a time, and x being typcially in the low hundred area, sometimes upgradable to 500 or so.
If you need STOCK DATA every minute, as per all US stocks, then this is technically identical to "real time prices", which ends up costing money. In adition you need a commercial higher end data feed of which I know of... one. Sorry. Costs going to be near or full four digit, without (!) publication rights.
And that is NxCore - their system has a data offer that offers US Stocks (all exchanges) real time, complete feed with all corretions etc. Native and C# wrapper API, so you can take the real time data feed, update your current pricing in memory and write them out to sql server every minute. Preferably not from asp.net (baaaaad choice for something that should run 24/7 without interruption unless you do heavy setup changes etc.) but from an installed windows service. Takes some bandwidth - no real idea how much (I am getting 4 exchanges from them, but no stocks, only the cme group futures, CME, CBOT, NYMEX and COMEX).
Note that wwith this setup you can go faster, too, but if you go fully real time you need a serious server. We talk of a billion updates or so per day...
End user sql server setup (i.e. little ram, and few slow discs) wont work.
Too expensive? Ther are plenty of data feeds around for a lower price, but they will not give you "stocks" as in "all of them", just "a selection".
If you are ok with not real time data - i.e. pulling stuff down at the end of the day, eoddata.com has a decent offer. YOu could also thnen pull things up via an asp.net page, but again.... you will not have the data during the day, just - well - after close. Smallest granularity is 1 minute. Repluublication rights again a no - but probably you can talk to them.
This isn't really SQL Server specific; a typical solution is that your run a process that polls an external source (a web service or the like) at regular intervals and uses this information to update the database. You can either implement this as a simple command-line program that gets executed every minute from the task scheduler, or you can make it a windows service that sleeps most of the time and only wakes up once a minute to do its processing. Once you have that, writing to the database is as usual.

asp.net web site developer pricing

i have been approached to build some websites for a few small businesses. They want a basic out of the box database driven website with some standard stuff (users, authentication, a few dynamic pages, etc). i am going to use asp.net mvc for this.
they have asked me how much i charge for this. my question, is that i have no frame of reference here. should i charge for the project a flat fee or a per hour charge. where do i start here to help determine correct pricing for a website project.
Charge an hourly fee that is about 3x the hourly rate you would command in a full-time job. The 3x multiplier basically evens things out for the benefits, etc. that you won't get as a 1099 employee.
Whatever you do, no matter how "Standard" it sounds. Do not charge a flat fee. Under that arrangement they have no incentive to curb feature creep. Even if you agree to a really tight spec up front, it is a recipe for disaster because it forces you to renegotiate every time they want something more. Under an hourly arrangement feature-creep works to your advantage.
Also, don't discount the hourly rate if you are a novice. Just don't bill unproductive hours. It is much easier to ease into billing more hours later than renegotiating the price per hour.
Charge per hour.
-- edit
So attempt to 'quote' it by estimating the number of hours. Make sure your estimate is conservative.
A nice approach is, in your head, consider the 'min', 'max', 'standard' type of time. Then use that to estimate the real time it will take you.
If you know that they know what they want and won't change the specs on you, go for a lump sum. That way you can work quickly.
If they are prone to change their minds and don't know what they want, go for an hourly fee. That way you won't be stuck working on their project for months without additional pay when they can't decide on exactly what they want.
I admit that I don't know much about this issue. However, I would still like to warn about the whole charge-per-hour mentality. While this approach basically protects the developer, it doesn't work well with the business owner:
Charge-per-hour, to the business owner, is a liability, whereas fix-price is just a cost. That's one.
The second thing is, if you are charging per hour, how are you going to justify your "research time"? Are you going to charge that as well? But business owner doesn't like to pay for research time. Or you can stick to your old trick and do something that has been reinvented N times and charge for the amount of time you spend. But that would seem unethical to some.
I have billed both by the hour and by the project. It's been my experience that customers are happier with project based billing instead of hourly billing.
With that in mind, I always pad the project cost by an amount I feel will cover the times when the client decides to change their mind. Further, I keep the project plan pretty simple. For example, I don't write 4 pages on how the login screen will work. Instead, It's a single bullet point: "Login Page". This allows both them and I a little flexibility.
Because I keep things simple AND I allow time for flexibility AND the clients know how much it's going to cost up front, my client's are happier and I can keep better track of my income. Also, I keep in pretty close contact with them. As long as you can keep the relationship good, you'll have a long term client.
Of course, it takes a bit of self discipline in combination with experience to know how long things take to build. Along these lines I never experiment on a client's dime. When I write the proposal, I already know what I'm going to use to get the job done and I've used those tools before. Because of this I can say with confidence that a login page will take a certain amount of time to put together.
Next, don't bite off more than you can chew. If it's a big project, break it up into smaller deliverables with their own pay schedule. That way the client (or you) can decide to walk away at any point. For example, if you think the project will take 3 months, break it up into 3 pieces. Incidentally, this helps with cash flow.
Finally, don't discount your time when getting started. That scares people.
I have a flat rate I charge for sites and outline exactly what they will get and then anything beyond that outline gets charged at an hourly rate. The hard part of this is if your getting into a project where your not sure how long it will take then you might want to break down the various pieces and then add at least 10 hours to that estimate. You don't want to sell yourself short but you also don't want to overcharge the customer. Be sure your clear up front that once the site is delivered then all changes are per hour or based on a maintenance fee structure.
Good luck.

Resources