How to handle Whisper's pre-allocation in a cloud environment?

I'm setting up a metrics infrastructure and I really like the powerful Graphite API. However, Whisper, the storage backend, does not currently work well for us because of its disk pre-allocation: every Whisper file is created at its full retention size up front. We run a cloud-based architecture where our apps change host/IP frequently, and since we want the host as part of each metric name, the Whisper database grows quickly.
How should I set up Graphite/Whisper to handle this situation?
I've also tried to find alternatives to Whisper, but nothing stands out. There are a lot of discussions and half-finished solutions for other storage engines, but nothing that seems mature and provides solid Graphite integration.
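For a sense of scale: Whisper allocates every file at its full retention size the moment a metric is first seen, so per-host metric names multiply that cost. Here is a rough sketch of the arithmetic, assuming Whisper's documented on-disk layout (16-byte file header, 12 bytes per archive header, 12 bytes per datapoint); the retention policy and host counts are made-up examples:

def whisper_file_size(archives):
    # archives: list of (seconds_per_point, retention_seconds) tuples,
    # mirroring a storage-schemas.conf retention definition.
    header = 16 + 12 * len(archives)
    points = sum(retention // step for step, retention in archives)
    return header + 12 * points

# e.g. 10s resolution for a day, 1min for a week, 10min for a year
archives = [(10, 86400), (60, 7 * 86400), (600, 365 * 86400)]
per_metric = whisper_file_size(archives)  # ~835 KB, allocated up front
print(f"{per_metric / 1024:.0f} KB per metric")

# With the host baked into every metric name, each new host/IP
# pre-allocates full files for all of its metrics:
hosts, metrics_per_host = 200, 50
print(f"{hosts * metrics_per_host * per_metric / 2**30:.1f} GB in total")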

Turns out it can't be done with Whisper.
I ended up using Cyanite as a replacement for Carbon/Whisper, while still keeping Graphite (Graphite-API, actually).

How to avoid the Forge Model Derivative queue

I want to use the Forge Viewer as a preview tool in my web app for generated data.
The problem I have is that the Model Derivative API is sometimes slow and sometimes fast.
I read that this happens because the files are placed in a queue and processed sequentially.
In my opinion, this can be solved by:
Having the extraction.update webhook also tell me where I am in the queue, so I can give my users better progress information, or stop the process when the queue is too long.
Being able to have a private queue. I have no problem paying more credits if necessary.
Being able to generate SVF2 files on my own server.
But I don't know if any of these options are possible, or if there is another workaround.
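In the meantime, the workaround I can think of is polling the translation manifest and surfacing its progress field to users; it doesn't reveal the queue position, but it's better than nothing. A minimal sketch against the documented Model Derivative manifest route (how the OAuth token is obtained is left out, and the URN is assumed to be base64-encoded):

import time
import requests

BASE = "https://developer.api.autodesk.com/modelderivative/v2/designdata"

def poll_translation(urn, access_token, interval=10.0):
    # Poll the manifest until the translation finishes; the manifest's
    # "status" and "progress" fields drive the user-facing feedback.
    headers = {"Authorization": f"Bearer {access_token}"}
    while True:
        resp = requests.get(f"{BASE}/{urn}/manifest", headers=headers)
        resp.raise_for_status()
        manifest = resp.json()
        status = manifest.get("status")      # pending/inprogress/success/failed/timeout
        progress = manifest.get("progress")  # e.g. "42% complete"
        print(f"status={status} progress={progress}")
        if status in ("success", "failed", "timeout"):
            return manifest
        time.sleep(interval)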
Yes, that could be useful. I logged that request in our system: DERI-7940
Might be considered later on, but no plans currently
I'm not aware of any plans for that
We're always working on making the translation service better, but unfortunately, I cannot tell when it will meet your requirements - including the implementation of the webhook feature you mentioned.
SVF2 is specifically for very large models - is that what you are working with? If not, then I'm quite certain that translating to SVF would be faster.
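To illustrate that suggestion: the output format is chosen in the translation job payload, so requesting SVF instead of SVF2 is a one-line change. A sketch of the documented job endpoint (the URN and token are placeholders):

import requests

def submit_svf_job(urn, access_token):
    # Ask the Model Derivative service for SVF output (rather than SVF2).
    resp = requests.post(
        "https://developer.api.autodesk.com/modelderivative/v2/designdata/job",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        json={
            "input": {"urn": urn},  # base64-encoded design URN (placeholder)
            "output": {"formats": [{"type": "svf", "views": ["2d", "3d"]}]},
        },
    )
    resp.raise_for_status()
    return resp.json()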

WSO2 ESB Clustered databases and Data Services

I was able to set up a WSO2 ESB cluster, following the dedicated doc, with one manager and two workers.
I am not sure about two points:
Does each worker node need its own REGISTRY_LOCAL database?
With both workers using the same DB it works, but I'm not sure that's the right way to do it, and the doc isn't clear about that.
Adding Data Services as a feature?
There is almost no documentation about that, but not being able to fetch more than one row is a big limitation for me. Is it possible to add this feature in a clustered environment, or is it better to separate the Data Services servers from the ESB ones?
If someone has experience with this kind of setup, I would really appreciate feedback.
Thanks
You can use the same registry database for all the members in the cluster so that they can communicate with each other through it. You may refer to my blog [1] for further information. Personally, I have never used local registry DBs for each member in a cluster setup.
You cannot fetch multiple records when you install the DSS feature into the ESB, and installing the feature incurs some performance overhead in the ESB. Therefore I strongly recommend using a separate DSS instance to get your work done. It also separates the concerns clearly, which is a good thing.
[1] http://ravindraranwala.blogspot.com/2015/09/wso2-esb-worker-manager-cluster-without.html

How to build a predictive dialer?

I need to build a reliable predictive dialer based on Asterisk. The system we currently use combines Wombat and Asterisk, and we do not find it usable: Wombat provides a poor API, and it's impossible to operate without regular manual intervention.
The system we want:
Can be used solely via an API or direct database queries (adding lists to campaigns, updating lists, starting and stopping campaigns, etc.) so that it can be completely integrated into an existing product
Is free, or paid annually at a price independent of the usage rate
Is considered stable
Should be able to handle tens of thousands of calls per day, if it matters
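For what it's worth, the raw call-origination building block is straightforward over the Asterisk Manager Interface (AMI); the prediction logic would sit on top of something like this sketch (host, credentials, trunk, and dialplan context are all placeholders):

import socket

def ami_send(sock, action):
    # AMI actions are CRLF-separated "Key: Value" lines ending in a blank line.
    sock.sendall(("".join(f"{k}: {v}\r\n" for k, v in action.items()) + "\r\n").encode())

with socket.create_connection(("asterisk.example.com", 5038)) as sock:
    sock.recv(1024)  # banner: "Asterisk Call Manager/x.y"
    ami_send(sock, {"Action": "Login", "Username": "dialer", "Secret": "s3cret"})
    # Async originate: dial out via the trunk and, on answer, drop the call
    # into the "campaign" context of the dialplan (e.g. an agent queue).
    ami_send(sock, {"Action": "Originate",
                    "Channel": "SIP/trunk-out/15551230001",
                    "Context": "campaign",
                    "Exten": "s",
                    "Priority": "1",
                    "Async": "true",
                    "CallerID": "Dialer <15550000000>"})
    print(sock.recv(4096).decode())  # login + originate responses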
Use vicidial.org, or hire a freelancer to build a new core with the API you need.
You can also check OSDial for this; it is also developed using Asterisk.
We have been working with a preview of the next version of Wombat through the Early Access program; it has a complete configuration and reporting JSON API, and you can deploy it "headless" in order to scale up to thousands of parallel lines. If you ask Loway, they can likely get you access to the Early Access program.
BTW, Vicidial is great for agent-based outbound, but it imposes quite a large penalty on the number of agents per server - you cannot reasonably use it for telecasting at the scale we are looking at, as it would require too many servers. Wombat is leaner and can drive over a thousand channels per server. YMMV.
This question would be better placed on a "hire-a-freelancer" site like oDesk... if you need custom programming done, those are the sorts of places to go to get manpower.
Your specifications are well within what is possible with Asterisk. I'd strongly recommend looking at ViciDial and OSDial as others have suggested; out of the box, they are pretty good.
The hard part of any auto-dialer is not the dialing, oddly enough. It's the prediction algorithms, the answering-machine detection algorithms, and the agent UI. Those are what make or break an auto-dialer application for a company.

Memory quota exceeded using WordPress on a shared Azure website

I'm trying to wrap my head around a memory quota violation. In the wild, if I have a VM and I try to run something beyond its memory limits (SSMS, for instance, on my VPS), SSMS simply crashes and says "not enough memory, dude."
Apparently on Microsoft Azure, if you request a function that takes you beyond your allocated memory... IT TURNS YOUR SITE OFF FOR AN HOUR.
I can't explain how awful that is, and from the other similar questions I've seen about Azure memory quotas, most of you can't either. BUT...
Is there anyone out here with WordPress experience on Azure who knows how to keep memory usage down? Alternatively, is there anyone here with WordPress experience on any platform who can explain what kinds of activities might draw more than 512 MB at a time?
Any help would be good help.
Thanks.
Closing this question because, as the first responder said, there isn't a satisfactory answer. I ended up going with a different hosting company that offers dedicated WP hosting, and have had no issues whatsoever.
I love MS. I use their technology stack whenever feasible, but sometimes you gotta call a spade a spade: I am not sold on Azure yet, though not for lack of trying.

Best way to keyword-search Amazon SimpleDB using EC2 and ASP.NET?

I am wondering if anyone has any thoughts on the best way to perform keyword searches on Amazon SimpleDB from an ASP.NET application running on EC2.
A couple of options I am considering are:
1) Add keywords to a multi-valued attribute and search with a query like the following (see the sketch after this list):
select id from keywordTable where keyword = 'firstword' intersection keyword = 'secondword' intersection keyword = 'thirdword'
Amazon Query Example
2) Create a webservice frontend to Katta:
Katta on EC2
3) A queued Lucene.Net update service that periodically pushes the Lucene index to the cloud (to get around the 'locking' issue):
Load balance Lucene(StackOverflow post)
Lucene on S3 (blog post)
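Here is a sketch of option 1's query end-to-end, in Python using boto3's SimpleDB client (my stack is ASP.NET, so this is only to illustrate the select semantics; the domain and attribute names follow the example above):

import boto3

sdb = boto3.client("sdb", region_name="us-east-1")

def keyword_search(words):
    # Multi-valued "keyword" attribute: `intersection` requires every word to match.
    conditions = " intersection ".join(f"keyword = '{w}'" for w in words)
    expr = f"select id from keywordTable where {conditions}"
    ids, token = [], None
    while True:  # page through results via NextToken
        kwargs = {"SelectExpression": expr}
        if token:
            kwargs["NextToken"] = token
        resp = sdb.select(**kwargs)
        for item in resp.get("Items", []):
            ids += [a["Value"] for a in item["Attributes"] if a["Name"] == "id"]
        token = resp.get("NextToken")
        if not token:
            return ids

print(keyword_search(["firstword", "secondword", "thirdword"]))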
If you are looking for a strictly SimpleDB solution (as the question is stated), Katta and Lucene won't help you. If you are looking for merely an 'Amazon infrastructure' based solution, then any of the choices will work.
All three options differ in terms of how much setup and management you'll have to do and deciding which is best depends on your actual requirements.
SimpleDB with a multi-valued attribute named keyword is your best choice if you need simplicity and minimum administration, and if you don't need to sort by relevance. There is nothing to set up or administer, and you'll only be charged for your actual CPU and bandwidth.
Lucene is a great choice if you need more than keyword searching, but you'll have to manage updates to the index yourself. You'll also have to manage the load balancing, backups, and failover that you would have gotten with SimpleDB. If you don't care about failover and can tolerate downtime while you do a restore in the event of an EC2 crash, then that's one less thing to worry about and one less reason to prefer SimpleDB.
With Katta on EC2 you'd be managing everything yourself. You'd have the most flexibility and the most work to do.
Just to tidy up this question... we wound up using Lightspeed's SimpleDB provider, Solr, and SolrNet, by writing a custom search provider for Lightspeed.
Info on implementing ISearchEngine interface for Lightspeed:
http://www.mindscape.co.nz/blog/index.php/2009/02/25/lightspeed-writing-a-custom-search-engine/
And this is the Solr Library we are using:
http://code.google.com/p/solrnet/
Since Solr can be easily scaled using EC2 machines, this made the most sense to us.
Simple Savant is an open-source .NET persistence library for SimpleDB which includes integrated support for full-text search using Lucene.NET. (I'm the Simple Savant creator.)
The full-text indexing approach is described here.
