Async Cassandra queries

I've been trying to update an app to improve performance; it uses the DataStax Cassandra Java Driver for its DAL services.
Now I need to convert synchronous queries to asynchronous ones, and I haven't found satisfactory answers to my doubts (at least not on the web pages I visited).
Could anybody please answer the questions below, or point me to a link where I could find an answer?
1> What are the possible problem scenarios I need to worry about before changing synchronous executes to async executes?
2> How will reads and writes behave? Can I change one of them without worrying about any issues?
Thanks in advance!

There are several things that you need to think about:
You need to rate-limit your code - by default there are only 1024 requests per connection, so you may need to increase this number. But even with an increased number of requests per connection it's easy to overload Cassandra, so you still need to control the number of in-flight requests yourself;
You need to handle errors correctly - add an error handler to the future/promise that is returned, and react accordingly;
You need to create statements correctly - for example, don't reuse the same BoundStatement, as it's not thread-safe;
Don't reuse the same List/Set/Map instances that you pass as parameters, etc.
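A minimal sketch of the first two points in plain Java: a Semaphore caps the number of in-flight requests, and an error handler is attached to the returned future. The `executeAsync` stand-in here completes immediately so the example is self-contained; with the real driver you would use the future returned by `Session.executeAsync` instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class RateLimitedAsync {
    // Cap on in-flight async requests; the driver default of 1024 requests
    // per connection is easy to exceed in a tight submission loop.
    private static final Semaphore inFlight = new Semaphore(256);

    // Stand-in for session.executeAsync(statement): completes immediately.
    static CompletableFuture<String> executeAsync(String query) {
        return CompletableFuture.completedFuture("row-for:" + query);
    }

    static CompletableFuture<String> submit(String query) throws InterruptedException {
        inFlight.acquire();                                      // block when too many requests are pending
        return executeAsync(query)
                .whenComplete((rs, err) -> inFlight.release())   // always free the permit
                .exceptionally(err -> {                          // react to failures instead of dropping them
                    System.err.println("query failed: " + err);
                    return null;
                });
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < 10; i++) {
            submit("SELECT * FROM t WHERE id = " + i)
                    .thenAccept(r -> done.incrementAndGet());
        }
        System.out.println("completed: " + done.get());
    }
}
```

The same shape works with the driver's ResultSetFuture/CompletionStage; the key point is that the permit is released in `whenComplete`, so it is freed on both success and failure.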

Firebase - optimizing read / write

Consider a chat-like implementation where clients write using a transaction on the head and read using an on('child_added') listener.
When a client writes, he will also get a read of the same version he sent, which means a redundant transfer of that version from the database. In the case of only one connected client typing, for example, all responses to the listener will be redundant.
I tried to optimize this by turning off the listener before writing and turning it on again when the write ended, with a startAt(new head). This way I don't get the redundant read of the location that was sent.
This all works fine, but I now don't know whether the cost of removing and adding the listener may be high as well. What is the best strategy here?
Firebase automatically optimizes it for you. This is pretty much the standard use case; it's what Firebase was designed for. The best strategy is to leave the listener on. Let Firebase do its thing.

Why is dynamic realtime not recommended by Asterisk?

In extconfig.conf it is mentioned that
"However, note that using dynamic realtime extensions is not recommended anymore as a best practice; instead, you should consider writing a static dialplan with proper data abstraction via a tool like func_odbc."
1) Why does Asterisk not recommend dynamic realtime extensions?
2) How do I write a static dialplan with data abstraction using a tool like func_odbc?
My requirement is that more and more extensions (in this case mobile numbers) will keep coming; how can I dynamically add them to sip.conf and get them registered to the SIP server?
There are some issues with dynamic realtime.
The most important issue is the dialplan.
When/if you use an EXACT dialplan match, like a full number match, it works OK. But when you use a pattern, Asterisk searches for the pattern in the context, and to do that it requests all records in this context from the database EVERY time the dialplan is accessed. That is really bad, and there is no easy way to fix it. For example, suppose you have a dialplan of 10 lines for the pattern _011. plus other patterns/numbers in the same dialplan, with an overall number of 1000 lines. When you call 011123456788, Asterisk requests the priority 1 line (the database checks 1000 rows), then the priority 2 line (another 1000-row check), and so on. So you get 10 × 1000 = 10,000 database rows checked for EVERY new call.
If you want a dynamic dialplan under high load, use database config storage for the dialplan (and reload it, for example, once every 10 minutes), and check for per-call features in the extension/dialplan using func_odbc. That way you have much more control over the SQL query. Sure, that requires that you understand MySQL and are able to build queries, but there is no other way for a dynamic PBX with more than 10-20 calls.
Realtime sippeers is a different matter. It has issues with database updates when peer update is enabled, and it does not update peer info if caching is enabled. You just have to live with that.
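A rough sketch of what a static dialplan with func_odbc lookups might look like. The DSN name, table, and column names here are assumptions for illustration; only the ODBC_&lt;section&gt; function naming and SQL_ESC come from func_odbc itself.

```ini
; func_odbc.conf -- defines a dialplan function ODBC_ROUTE()
; (the 'asterisk' DSN and the routes table/columns are assumptions)
[ROUTE]
dsn=asterisk
readsql=SELECT destination FROM routes WHERE prefix='${SQL_ESC(${ARG1})}'

; extensions.conf -- the dialplan itself stays static;
; only the per-call data comes from the database
[outbound]
exten => _X.,1,Set(DEST=${ODBC_ROUTE(${EXTEN})})
 same => n,GotoIf($["${DEST}" = ""]?unrouted)
 same => n,Dial(PJSIP/${DEST})
 same => n(unrouted),Hangup()
```

With this shape, each call issues exactly one query that you wrote yourself, instead of Asterisk pulling the whole context per priority.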

Caching of data in a text file — Better options

I am working on an application at the moment whose caching strategy is to read and write data to text files in a read/write directory within the application.
My gut reaction is that this is sooooo wrong.
I am of the opinion that these values should be stored in the ASP.NET Cache or another dedicated in-memory cache such as Redis or something similar.
Can you provide any data to back up my belief that writing to and reading from text files as a form of cache on the webserver is the wrong thing to do? Or provide any data to prove me wrong and show that this is the correct thing to do?
What other options would you provide to implement this caching?
EDIT:
In one example, a complex search is performed based on a keyword. The result from this search is a list of Guids. This is then turned into a concatenated, comma-delimited string, usually less than 100,000 characters. This is then written to a file using that keyword as its name, so that other requests using this keyword will not need to perform the complex search. There is an expiry - I think three days or so, but I don't think it needs to (or should) be that long.
I would normally use the ASP.NET Server Cache to store this data.
I can think of four reasons:
Web servers are likely to have many concurrent requests. While you can write logic that manages file locking (mutexes, volatile objects), implementing that is a pain and requires abstraction (an interface) if you plan to be able to refactor it in the future--which you will want to do, because eventually the demand on the filesystem resource will be heavier than what can be addressed in a multithreaded context.
Speaking of which, unless you implement paging, you will be reading and writing the entire file every time you access it. That's slow. Even paging is slow compared to an in-memory operation. Compare what you think you can get out of the disks you're using with the Redis benchmarks from New Relic. Feel free to perform your own calculation based on the estimated size of the file and the number of threads waiting to write to it. You will never match an in-memory cache.
Moreover, as previously mentioned, asynchronous filesystem operations have to be managed while waiting for synchronous I/O operations to complete. Meanwhile, you will not have data consistent with the operations the web application executes unless you make the application wait. The only way I know of to fix that problem is to write to and read from a managed system that's fast enough to keep up with the requests coming in, so that the state of your cache will almost always reflect the latest changes.
Finally, since you are talking about a text file, and not a database, you will either be determining your own object notation for key-value pairs, or using some prefabricated format such as JSON or XML. Either way, it only takes one failed operation or one improperly formatted addition to render the entire text file unreadable. Then you either have the option of restoring from backup (assuming you implement version control...) and losing a ton of data, or throwing away the data and starting over. If the data isn't important to you anyway, then there's no reason to use the disk. If the point of keeping things on disk is to keep them around for posterity, you should be using a database. If having a relational database is less important than speed, you can use a NoSQL context such as MongoDB.
In short, by using the filesystem and text, you have to reinvent the wheel more times than anyone who isn't a complete masochist would enjoy.
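To make the in-memory alternative concrete, here is a tiny TTL cache sketched in Java for illustration (the names are hypothetical; ASP.NET Cache or Redis give you this plus eviction, memory limits, and cross-process sharing):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Minimal TTL cache: a stand-in for what ASP.NET Cache / Redis provide.
public class TtlCache<K, V> {
    private record Entry<V>(V value, long expiresAt) {}

    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Return the cached value, or compute it (the "complex search") and store it.
    public V get(K key, Supplier<V> compute) {
        Entry<V> e = map.get(key);
        if (e == null || System.currentTimeMillis() > e.expiresAt) {
            V v = compute.get();
            map.put(key, new Entry<>(v, System.currentTimeMillis() + ttlMillis));
            return v;
        }
        return e.value;
    }
}
```

Usage would look like `cache.get(keyword, () -> runComplexSearch(keyword))`: the second request for the same keyword within the TTL never touches the search, and there is no file locking or format-corruption risk.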

.NET vs SQL Server - best practice for repeated searching

My client is asking for a "suggestion" based lookup to be added to a particular form field.
In other words, as you start typing into a field there should be a "Google style" popup which suggests possible results to select from. There will be in the order of "tens of thousands" of possible suggestions - this is the best estimate I currently have on the quantity.
Using AJAX to send/retrieve the result, my question is whether it is better to store ALL the suggestions within .NET cache and process there, or whether it's better to run a stored-procedure based query on SQL Server for each request?
This would be using .NET 2.0 and SQL Server 2005
There is one trick I use every time I'm faced with such a task: do not run the search on every keystroke. Put the launching of the search on a sliding timeout.
The intent here is to launch the search only when the user has paused in his/her typing. Usually I set the timeout at 0.1 to 0.2 seconds. Psychologically it is still instantaneous, but it considerably reduces the load on whatever you will use to search.
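That sliding timeout ("debounce") can be sketched like this - Java is used here for illustration, but the same shape applies to client-side JavaScript:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Sliding-timeout ("debounce") launcher: each keystroke cancels the
// previously scheduled search, so the search fires only after a pause.
public class Debouncer {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final long delayMillis;
    private ScheduledFuture<?> pending;

    public Debouncer(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    public synchronized void onKeystroke(Runnable search) {
        if (pending != null) {
            pending.cancel(false);   // restart the sliding window
        }
        pending = scheduler.schedule(search, delayMillis, TimeUnit.MILLISECONDS);
    }

    public void shutdown() {
        scheduler.shutdown();
    }
}
```

With a 150 ms window, ten rapid keystrokes trigger a single search instead of ten.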
Your bottleneck will be transporting the data from the server to the browser. You can easily produce a large result from the database in almost no time at all, but it takes forever to return it to the browser.
You will have to find a way to limit the result that the server returns, so that you only fetch what the user has to see right now.
When you have the data traffic down to reasonable level, you can start looking at optimising the server part. With some caching and logic that should be easy, considering that when the user types the result is often a subset of the previous result, e.g. the match for "hat" is a subset of the match for "ha".
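The "each result is a subset of the previous one" idea can be sketched as follows (illustrative Java; it assumes a simple prefix match, and a real implementation would also bound the cached result size):

```java
import java.util.List;
import java.util.stream.Collectors;

public class SuggestCache {
    private String lastQuery = null;
    private List<String> lastResult = null;

    // Hypothetical server call; only needed when we cannot refine locally.
    private List<String> fetchFromServer(String q, List<String> all) {
        return all.stream().filter(s -> s.startsWith(q)).collect(Collectors.toList());
    }

    public List<String> suggest(String q, List<String> all) {
        List<String> result;
        if (lastQuery != null && q.startsWith(lastQuery)) {
            // "hat" extends "ha": filter the previous result instead of re-querying
            result = lastResult.stream()
                               .filter(s -> s.startsWith(q))
                               .collect(Collectors.toList());
        } else {
            result = fetchFromServer(q, all);
        }
        lastQuery = q;
        lastResult = result;
        return result;
    }
}
```

Typing "h", "ha", "hat" then costs one server round trip instead of three, and each refinement is a cheap in-memory filter.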
When I've seen "suggest as you type" searches done in SQL Server environments, I've seen the best performance using some sort of caching mechanism, typically a distributed one such as memcached. Even if your code is optimized well, your database is tuned well, and your query takes only <= 10ms including the call, processing, and return, that is still 10ms as they type.
It depends on the number of items. If you can cache the items in a .NET process without running out of memory, this will definitely be faster.
But if that can't be done you are stuck with accessing the database on each request. A stored procedure would be a nice way to do that.
But there are some design patterns which can guide you. I've used the following two while developing something similar to what you are doing.
Submission throttling
Browser side cache

Using static data in ASP.NET vs. database calls?

We are developing an ASP.NET HR Application that will make thousands of calls per user session to relatively static database tables (e.g. tax rates). The user cannot change this information, and changes made at the corporate office will happen ~once per day at most (and do not need to be immediately refreshed in the application).
About 2/3 of all database calls are to these static tables, so I am considering just moving them into a set of static objects that are loaded during application initialization and then refreshed every 24 hours (if the app has not restarted during that time). Total in-memory size would be about 5MB.
Am I making a mistake? What are the pitfalls to this approach?
From the info you present, it looks like you definitely should cache this data -- rarely changing and so often accessed. "Static" objects may be inappropriate, though: why not just access the DB whenever the cached data is, say, more than N hours old?
You can vary N at will, even if you don't need special freshness -- even hitting the DB 4 times or so per day will be much better than "thousands [of times] per user session"!
Best may be to keep, alongside the DB info, a timestamp or datetime recording when it was last updated. That way, the check for "is my cache still fresh" is typically very lightweight: just fetch that "latest update" value and compare it with the latest update on which you rebuilt the local cache. It's kind of like an HTTP "If-Modified-Since" caching strategy, except you'd be implementing most of it on the DB-client side ;-).
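That "if-modified-since" scheme can be sketched in Java as follows (the probe, e.g. something like SELECT MAX(updated_at) FROM tax_rates, and the loader are hypothetical stand-ins):

```java
import java.util.function.Supplier;

// Cache that revalidates against a cheap "last updated" probe,
// reloading the expensive data only when the source actually changed.
public class FreshnessCache<V> {
    private final Supplier<Long> lastUpdatedProbe; // cheap query, e.g. MAX(updated_at)
    private final Supplier<V> loader;              // the expensive full load
    private long cachedVersion = Long.MIN_VALUE;
    private V cached;

    public FreshnessCache(Supplier<Long> lastUpdatedProbe, Supplier<V> loader) {
        this.lastUpdatedProbe = lastUpdatedProbe;
        this.loader = loader;
    }

    public synchronized V get() {
        long version = lastUpdatedProbe.get();   // lightweight round trip
        if (version != cachedVersion) {          // source changed: rebuild the cache
            cached = loader.get();
            cachedVersion = version;
        }
        return cached;
    }
}
```

Thousands of reads per session then cost one cheap probe each (or none, if you only probe every N minutes), while the full load runs only when the corporate office actually publishes a change.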
If you decide to cache the data (vs. making a database call each time), use the ASP.NET Cache instead of statics. The ASP.NET Cache provides functionality for expiry, handles multiple concurrent requests, and can even invalidate the cache automatically using the query notification features of SQL Server 2005+.
If you use statics, you'll probably end up implementing those things anyway.
There are no drawbacks to using the ASP.NET Cache for this. In fact, it's designed for caching data too (see the SqlCacheDependency class http://msdn.microsoft.com/en-us/library/system.web.caching.sqlcachedependency.aspx).
With caching, a dbms is plenty efficient with static data anyway, especially only 5M of it.
True, but the point here is to avoid the database roundtrip at all.
ASP.NET Cache is the right tool for this job.
You didn't state how you will be able to find the matching data for a user. If it is as simple as finding a foreign key in the cached set, then you don't have to worry.
If you implement some kind of filtering/sorting/paging, or worse, searching, then you might at some point miss the querying capabilities of SQL.
ORMs often have their own querying, and LINQ makes things easy too, but it is still not SQL.
(try to group by 2 columns)
Sometimes it is a good approach to have the DB return only the keys of a result set and use the Cache to fill in the complete set.
Think: Premature Optimization. You'll still need to deal with the data as tables eventually anyway, and you'd be leaving an "unusual design pattern".
Even with default caching, a DBMS is plenty efficient with static data anyway, especially only 5MB of it. And the DBMS partitioning you're describing is often described as an antipattern. One example: multiple identical databases for multiple clients. There are other questions here on SO about this pattern. I understand there are security issues, but doing it this way creates other security issues. I've recently seen this same concept in a medical billing database (even more highly sensitive) that ultimately had to be refactored into a single database.
If you do this, then I suggest you at least wait until you know it's solving a real problem, and then test to measure how much difference it makes. There are lots of opportunities here for Unintended Consequences.
