With Watson Conversation, is there a limit or best practice for the number of examples for a given intent? There is concern that too many examples for an intent might diminish accuracy.
Per the IBM Cloud docs, Defining intents:
Intent limits:
The number of intents and examples you can create depends on your Conversation service plan
Standard/Premium service plan: 2,000 Intents per workspace
Lite service plan: 100 Intents per workspace
Both service plans above have 25,000 examples per workspace
EDIT
Also note: the pricing structure makes no mention of example limits. That said, the following article is worth checking out: Planning your intents and entities
Been reading up on Corda (no actual use yet) and other DLTs to see if we could use it in a project. What I was wondering after reading all the Corda key concepts: what would be the way to share data with everyone, including nodes that are only added later?
I've been reading things like https://corda.net/blog/broadcasting-a-transaction-to-external-organisations/ and https://stackoverflow.com/a/53205303/1382108. But what if another node joins later?
As an example use case: say an organization wants to advertise goods it's selling to all nodes in a network, while price negotiations or actual sales can then happen in private. If a new node joins, what's the best/easiest way to make sure they are also aware of the advertised goods? With blockchain technologies I'd think they'd just replicate the chain that has these facts upon joining but how would it work in Corda? Any links or code samples would be greatly appreciated!
You can share transactions with other nodes that have not previously seen them, but this sort of functionality doesn't come out of the box and has to be implemented using flows by the CorDapp developer.
As the author of ONIXLabs, I've implemented much of this functionality generally to make it easier for CorDapp developers to consume. There are quite a few feature-rich APIs available on GitHub.
In order to publish a transaction, the ONIXLabs Corda Core API contains functions that extend FlowLogic<*> to provide generalised transaction publishing:
publishTransaction, called on the initiating side of the flow, specifies the transaction to be published and to whom.
publishTransactionHandler, called on the initiated-by/handler side of the flow, specifies the transaction to be recorded and who it's from.
As an example of how these APIs are consumed, take a look at the ONIXLabs Corda Identity Framework, where we have a mechanism for publishing accounts from one node to a collection of counterparties.
PublishAccountFlow consumes the publishTransaction function.
PublishAccountFlowHandler consumes the publishTransactionHandler function.
I read this document and among several very relevant topics, some of them are key to a scalability problem I am facing.
Basically, the document states that it is possible to overcome the limit of one update per second per entity, which is what drove me to Redis in a use case that would not otherwise have required it.
"a (google) software engineer in the Datastore team had mentioned a technique to obtain much higher throughput than one update per second on an entity group"
"The basic idea of Job Aggregation is to use a single thread to process a batch of updates. Because there is only one thread and only one transaction open on the entity group, there are no transaction failures due to concurrent updates. You can find similar ideas in other storage products such as VoltDb and Redis."
This is very useful to me but I don't have any clue on how this works.
Would just creating a service and serialising (pull queue) upserts to a specific Kind solve the issue? How could Datastore be sure that no other thread would suddenly begin to upsert?
Thanks
It is important to keep in mind that Job Aggregation is not part of Datastore. As the document says, you need to apply the updates in a single batch. See Batch operations for how to upsert multiple entities.
As for your second question: Datastore is not responsible for ensuring that no other thread begins to upsert. You must ensure that this does not happen in order to get the better performance.
Datastore best practices lists other practices that Google recommends for better performance.
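To make the Job Aggregation idea quoted in the question more concrete, here is a minimal sketch, not Google's implementation. Callers enqueue updates, and a single worker thread drains the queue and writes each batch in one commit, so only one writer ever touches the entity group. InMemoryStore, put_multi and JobAggregator are hypothetical names used for illustration; in a real application the store would be replaced by your Datastore client and its batch write.

```python
import queue
import threading


class InMemoryStore:
    """Hypothetical stand-in for a datastore client; it only counts commits."""

    def __init__(self):
        self.data = {}
        self.commits = 0

    def put_multi(self, updates):
        # One commit per batch, no matter how many updates it contains.
        self.data.update(updates)
        self.commits += 1


class JobAggregator:
    """Funnels all updates through one worker thread that writes in batches."""

    def __init__(self, store, max_batch=100):
        self.store = store
        self.max_batch = max_batch
        self.jobs = queue.Queue()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def submit(self, key, value):
        # Safe to call from any thread; the caller never writes to the store.
        self.jobs.put((key, value))

    def close(self):
        # None is the shutdown signal; wait until queued jobs are flushed.
        self.jobs.put(None)
        self.worker.join()

    def _run(self):
        while True:
            batch = [self.jobs.get()]  # block until there is work
            while len(batch) < self.max_batch:
                try:
                    batch.append(self.jobs.get_nowait())
                except queue.Empty:
                    break
            stop = any(job is None for job in batch)
            # Later updates to the same key overwrite earlier ones.
            updates = dict(job for job in batch if job is not None)
            if updates:
                self.store.put_multi(updates)
            if stop:
                return
```

Nothing in this sketch stops another process from writing to the same entities directly; as noted above, the application has to guarantee that every upsert goes through the single aggregator.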
I'm considering using Amazon's DynamoDB. Naturally, if you're bothering to use a highly available distributed data store, you want to make sure your client deals with outages in a sensible way!
While I can find documentation describing Amazon's "Dynamo" database, it's my understanding that "DynamoDB" derives its name from Dynamo, but is not at all related in any other way.
For DynamoDB itself, the only documentation I can find is a brief forum post which basically says "retry 500 errors". For most other databases much more detailed information is available.
Where should I be looking to learn more about DynamoDB's outage handling?
While Amazon DynamoDB indeed lacks a detailed statement about their choices regarding the CAP theorem (still hoping for a DynamoDB edition of Kyle Kingsbury's most excellent Jepsen series - Call me maybe: Cassandra analyzes a Dynamo inspired database), Jeff Walker Code Ranger's answer to DynamoDB: Conditional writes vs. the CAP theorem confirms the lack of clear information in this area, but asserts that we can make some pretty strong inferences.
In fact, the referenced forum post also suggests a strong emphasis on availability:
DynamoDB does indeed synchronously replicate across multiple
availability zones within the region, and is therefore tolerant to a
zone failure. If a zone becomes unavailable, you will still be able to
use DynamoDB and the service will persist any successful writes that
we have acknowledged (including writes we acknowledged at the time
that the availability zone became unavailable).
The customer experience when a complete availability zone is lost
ranges from no impact at all to delayed processing times in cases
where failure detection and service-side redirection are necessary.
The exact effects in the latter case depend on whether the customer
uses the service's API directly or connects through one of our SDKs.
Other than that, Werner Vogels' posts on Dynamo/DynamoDB provide more insight:
Amazon's Dynamo - about the original paper
Amazon DynamoDB – a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications - main introductory article including:
History of NoSQL at Amazon – Dynamo
Lessons learned from Amazon's Dynamo
Introducing DynamoDB - this features the most relevant information regarding the subject matter
Durable and Highly Available. Amazon DynamoDB replicates its data over
at least 3 different data centers so that the system can continue to
operate and serve data even under complex failure scenarios.
Flexible. Amazon DynamoDB is an extremely flexible system that does
not force its users into a particular data model or a particular
consistency model. DynamoDB tables do not have a fixed schema but
instead allow each data item to have any number of attributes,
including multi-valued attributes. Developers can optionally use
stronger consistency models when accessing the database, trading off
some performance and availability for a simpler model. They can also
take advantage of the atomic increment/decrement functionality of
DynamoDB for counters. [emphasis mine]
DynamoDB One Year Later: Bigger, Better, and 85% Cheaper… - about improvements
Finally, Aditya Dasgupta's presentation about Amazon's Dynamo DB also analyzes its modus operandi regarding the CAP theorem.
Practical Guidance
In terms of practical guidance for retry handling, the DynamoDB team has meanwhile added a dedicated section about Handling Errors, including Error Retries and Exponential Backoff.
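As a rough illustration of retrying with exponential backoff, here is a minimal sketch that is independent of any AWS library. RetryableError and call_with_backoff are hypothetical names, and the attempt count and delays are arbitrary assumptions. The AWS SDKs ship their own retry logic, so something like this mainly matters if you call the API directly.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. HTTP 500, throttling)."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Call operation(), retrying on RetryableError with capped, jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay cap doubles on every attempt, up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter spreads out clients that failed at the same moment.
            sleep(random.uniform(0, cap))
```

The sleep parameter exists only so that the waiting can be replaced in tests; non-retryable errors (for example validation failures) should be raised as other exception types so that they are not retried.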
I need to build a reliable predictive dialer based on Asterisk. Currently the system we use includes Wombat and Asterisk, and we do not find this solution usable as Wombat provides a poor API and it's impossible to use it without regular manual operations.
The system we want:
Can be used solely via API or direct database queries (adding lists to campaigns, updating lists, starting campaigns, stopping campaigns etc.) so that it can be completely integrated into an existing product
Is free, or paid for annually independent of the usage rate
Is considered stable
Should be able to handle tens of thousands of calls per day, if it matters
Use vicidial.org, or hire a freelancer to build a new core with the API you need.
You can also check out OSDial, which is also built on Asterisk.
We have been working with a preview of the next version of Wombat, through the Early Access program, and Wombat has a complete configuration and reporting JSON API and you can deploy it "headless" in order to scale up to thousands of parallel lines. If you ask Loway they can likely get you access to the Early Access program.
BTW, Vicidial is great for agent-based outbound, but imposes quite a large penalty on the number of agents per server - you cannot reasonably use it to do telecasting at the scale we are looking for, as it would require too many servers. Wombat is leaner and can drive over one thousand channels per server. YMMV.
This question would be better placed on a "hire-a-freelancer" site like oDesk ... if you need custom programming done, those are the sorts of places to go to get manpower.
Your specifications are well within what is possible with Asterisk. I'd strongly recommend looking at Vici Dial and OS Dial as others have suggested; out of the can, they are pretty good.
The hard part of any auto-dialer is not the dialer, oddly enough. It's the prediction algorithms, the answering machine detection algorithms and the agent UI. Those are what make or break an auto-dialer application for a company.
I am trying to understand the definitions in this document.
http://www.opengroup.org/soa/source-book/ontologyv2/service.htm
Their definitions of service, service interface and service contract are either unclear or seem different from what I normally encounter.
Service:
“A service is a logical representation of a repeatable activity that
has a specified outcome. It is self-contained and is a ‘black box’ to
its consumers.”
Let's say I have a WCF project and it has two operations
StoreFront
+GetPrice
+AddToCart
The definition says "a repeatable activity". So is the service StoreFront? Or do I have two services (GetPrice and AddToCart)?
Service Contract:
Has an "effect" class. Is the effect "return price" and "added to cart"?
From the same article:
“A capability offered by one entity or entities to others using
well-defined ‘terms and conditions’ and interfaces.” (Source: OMG
SoaML Specification - my italics)
This is, in my opinion, a preferable definition to the one talking about "repeatable activities".
The key word in the definition is capability. Capability refers to Business Capability which is a carry-over from the BPM industry, but in an SOA context refers to a business domain with distinct boundaries.
So from this definition we can surmise that services should be exposed or should operate within a business capability/process boundary. This leads us towards the idea (from the principles or tenets of SOA) that services should be autonomous within well-defined boundaries.
In your example, you are asking
So is the service StoreFront? Or do I have two services (GetPrice and
AddToCart)
The answer to that as always is "it depends". However, generally Pricing (GetPrice) would belong to a different business capability to Ordering (AddToCart). Additionally, the operations differ in some other important ways:
GetPrice is a read operation, while AddToCart is a write operation.
GetPrice is a synchronous operation, while AddToCart could very well be asynchronous.
So from these we should probably assume that they are two different services from a business perspective.
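To illustrate the two differences listed above, here is a minimal sketch in Python rather than WCF. It models Pricing as a synchronous, read-only service and Ordering as a service that accepts write commands and processes them later. All class and method names (PricingService, OrderingService, AddToCartCommand, process_pending) are hypothetical and only serve the example.

```python
import queue
from dataclasses import dataclass


class PricingService:
    """Pricing capability: read-only and synchronous."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def get_price(self, sku):
        # The caller waits for the answer; no state is changed.
        return self._prices[sku]


@dataclass(frozen=True)
class AddToCartCommand:
    cart_id: str
    sku: str
    quantity: int


class OrderingService:
    """Ordering capability: accepts write commands and applies them later."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._carts = {}

    def add_to_cart(self, command):
        # Asynchronous from the caller's view: accept and return immediately.
        self._inbox.put(command)

    def process_pending(self):
        """Apply all queued commands and return how many were processed."""
        processed = 0
        while True:
            try:
                command = self._inbox.get_nowait()
            except queue.Empty:
                return processed
            cart = self._carts.setdefault(command.cart_id, {})
            cart[command.sku] = cart.get(command.sku, 0) + command.quantity
            processed += 1

    def cart_contents(self, cart_id):
        return dict(self._carts.get(cart_id, {}))
```

The two classes share no state and no base type, which is the point of the sketch: each could be developed and deployed separately.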
This assumption has some radical repercussions. If they are two services, then according to SOA they should be autonomous, meaning that we should be looking to minimize coupling between the services in every possible way, so that as much as possible they can be planned, developed, tested, built, deployed, hosted, supported, and managed as separate concerns.
Another repercussion is that when you physically separate services to this extent, how can you show this stuff together to your users? They may be different capabilities but they still need to work together on the screen.
Additionally, from a back end perspective Ordering needs to know about Pricing data, otherwise how can order fulfillment happen? If you've separated the database into two, how can the Checkout service know how much stuff costs, what discounts to apply, etc?
I have posted about this stuff before, so please feel free to have a read. I would recommend reading the excellent article on Microservices by Lewis and Fowler also.