NLU doesn't recognize some names - watson-conversation

I am working on a chatbot that uses both Watson Conversation and NLU. My chatbot is designed to provide information about other people - friends, colleagues (e.g., their current position, contact number, etc.). The chatbot isn't able to recognize some names as persons. How do I handle this situation? Any thoughts?

NLU recognizes Person entities pretty well. If you are using Watson Conversation along with NLU, I am not sure how you are combining the two. Can you explain it a bit more?
I faced a similar situation using NLU; here are a few ways I overcame it:
1. NLU relies on context, so instead of sending smaller texts, send bigger texts.
2. It also relies on grammatical features: using correct punctuation and capitalizing the first letter of person and city names helped me get better results, as in the sketch below.
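To illustrate the idea, here is a minimal sketch of calling NLU from Python with the ibm-watson SDK. The API key, service URL, and example sentence are placeholders, and the example simply shows the kind of longer, well-punctuated input that tends to help entity detection.

```python
# A minimal sketch (not a definitive recipe) of Watson NLU entity extraction.
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_watson.natural_language_understanding_v1 import Features, EntitiesOptions
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

nlu = NaturalLanguageUnderstandingV1(
    version="2021-08-01",
    authenticator=IAMAuthenticator("YOUR_API_KEY"),
)
nlu.set_service_url("YOUR_SERVICE_URL")

# A longer, well-punctuated sentence with capitalized names gives the model
# more context than a bare, lower-case name like "john smith".
text = "Yesterday I met John Smith, our new sales director, at the Boston office."

result = nlu.analyze(
    text=text,
    features=Features(entities=EntitiesOptions(limit=10)),
).get_result()

for entity in result.get("entities", []):
    print(entity["type"], entity["text"], entity.get("confidence"))
```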

Related

How to specify gender in Google Cloud Translation API

I am using the Google Cloud Translation API in one of my projects. I want to specify the gender for the translation. I am unable to find anything about this in the Google Cloud Translation documentation. I have also searched a lot on the Internet but have not found any way to do this. I know how to specify the gender in the Google Text-to-Speech API using SSML, but I need it for translation. Any help will be highly appreciated.
After much searching I have discovered that there is currently no way to do this.
I have made a feature request along these lines at the invitation of GCP support.
The documentation indicates that feature requests are prioritised by how often an issue is starred, so for now my best answer is to star the issue here so that they know how many people are interested in this.
Looking for the same...
As it is NMT (Neural Machine Translation), it reacts to context.
I tried many combinations and found that this works well so far ('says', rather than 'to' or 'talk').
The examples are EN > ES.
However, sometimes its effect doesn't reach far into the translation,
so you have to stick the 'prefix' before each sentence.
Sometimes you get irregular behavior (see the lower-case "estoy"), and when you change something irrelevant (to you, but not to the model)... voilà!
So the final version (for now) is:
I guess the point is:
Understanding how it works (Machine Learning Language Models)
The model (algorithm) they use is evolving, so you need to keep an eye on it, as what works today may break tomorrow.
Once you get the response you will have to filter out your 'prefix', but that is not too difficult.
Please comment if you find better ways (or the API gets updated).
Related info: https://ai.googleblog.com/2018/12/providing-gender-specific-translations.html
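As a rough sketch of the "prefix" idea described in this answer, the snippet below prepends a gender cue to each sentence and strips it from the result, using the v2 client of google-cloud-translate. The exact prefix wording ("She says:") is only an illustration; you may need to experiment, and behaviour can change as the underlying model evolves.

```python
# Hedged sketch of a gender-hint prefix for Google Cloud Translation (v2 client).
from google.cloud import translate_v2 as translate

client = translate.Client()

PREFIX = "She says:"  # hypothetical gender cue, added to each sentence

def translate_with_gender_hint(sentence, target="es"):
    hinted = f"{PREFIX} {sentence}"
    result = client.translate(hinted, source_language="en", target_language=target)
    translated = result["translatedText"]
    # Strip whatever the prefix became; in practice you may need a looser match.
    return translated.split(":", 1)[-1].strip()

print(translate_with_gender_hint("I am tired."))  # hopefully the feminine form
```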

Alexa - build custom slot for addresses

I am creating a skill in which a user can say an address (for further processing). An address can be anything from "New York" to "123 First Avenue Washington" to "Seattle Harbor". Basically like something you can enter at Google Maps - it will recognize more or less everything :)
So now of course comes the problem of how to create a custom slot for this. LITERAL is deprecated, plus I am working on a German-language skill.
Should I actually try to fill the 50,000 lines I have available for a custom slot with as many enumerations of addresses as I can come up with? I'm afraid that even if I go down that road, Alexa will still try to map any input that's not in that list to one that is, thereby rendering my skill a bit moot :(
Thanks for any advice!
As you suggest, using a custom slot with 50K sample addresses wouldn't really work. Something as complicated as an address really needs a built-in slot type, and there is one for US skills:
https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/built-in-intent-ref/slot-type-reference#postaladdress
But you noted that you are targeting a German language skill and as far as I know there isn't a German language or address version of the above built-in slot yet.
The fact that they have done it for the US suggests that they will add it for Germany at some point, but counting on that is risky, of course, so you are in a difficult position. In the meantime I would suggest you go to the feature request space and add a request for a German version of the above:
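For a US-English skill that can use the built-in AMAZON.PostalAddress type, reading the slot in a handler is straightforward. The sketch below uses a raw Python Lambda handler; the intent name ("GetAddressIntent") and slot name ("address") are assumptions for illustration only.

```python
# Minimal sketch of reading a built-in AMAZON.PostalAddress slot (US English).
def lambda_handler(event, context):
    request = event.get("request", {})
    if request.get("type") == "IntentRequest" and \
       request["intent"]["name"] == "GetAddressIntent":
        slots = request["intent"].get("slots", {})
        address = slots.get("address", {}).get("value")  # e.g. "123 First Avenue"
        speech = f"I heard the address {address}." if address else \
                 "Sorry, I didn't catch an address."
        return {
            "version": "1.0",
            "response": {
                "outputSpeech": {"type": "PlainText", "text": speech},
                "shouldEndSession": True,
            },
        }
```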

Pre-Built Intents for Watson Conversation

After setting up the IBM Watson Conversation service on my Raspberry Pi today, I was disappointed to see that I'd have to write out every possible input (intent) and output (entity). Chalk this up to my extreme naivety around machine learning, but isn't there a way to tie into an existing set of conversation capabilities?
For example, I'm sure Watson already knows all the words for Hello and their proper responses. Or how to answer a variety of silly questions. Is there any way to tap into the Watson we all saw on Jeopardy?
Thanks for your help!
There are a number of options here.
System Entities
These are pre-defined to allow you to understand certain common concepts. Numbers, Currency and Dates are the available ones at the moment, but there are more coming.
Entities
You can also pull public lists and import as CSV. For example: http://data.okfn.org/data/core/country-list . You may need to check the licensing before using though.
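As a hedged illustration of that import step, the script below converts a public list like the country-list CSV linked above into the layout Conversation expects for entity import, which is roughly entity_name,value,synonym1,synonym2,... The column names assume the source file has "Name" and "Code" columns; adjust them to the actual file.

```python
# Sketch: turn a public country list into a Watson Conversation entity CSV.
import csv

with open("country-list.csv", newline="", encoding="utf-8") as src, \
     open("entities.csv", "w", newline="", encoding="utf-8") as dst:
    reader = csv.DictReader(src)
    writer = csv.writer(dst)
    for row in reader:
        # entity, value, synonym
        writer.writerow(["country", row["Name"], row["Code"]])
```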
Intents
As you mentioned, there are no pre-defined intents. But there are two solutions available that augment Conversation.
First is "Watson Virtual Agent". This is a SaaS that contains pre-defined training sets for certain industries, as well as custom UI you can slot into your application. It's not cheap, but you can get a trial to play with it.
The other option is "Project Intu". It's still experimental, but it's purpose is to help in building robots/IOT devices. It contains pre-defined chit chat and some off topic stuff.
They have a "TJ Bot" project which can be used with a raspberry pi.

How Does an RSS Feed Work?

How's it going?
I've found a lot of more detailed answers about specific problems with RSS feeds, but I can't really figure out how you USE one, basically.
Could someone explain?
I see the RSS feed icon at the top of a lot of WordPress sites, including my own, but when I click it, it just seems to be a long XML file. I don't know what to do with it, or even why it would be there.
How do you use this? Are you meant to hit it with an API request, or is there a particular kind of software that you use?
Cheers
Before telling you what RSS is, let me describe a common problem that many people have.
Say there is a bunch of sites that you really like and it's sort of a daily routine for you to go through them. They may be a news site, your friend's blog, but also craigslist because you're currently looking for a new house, and maybe a weather site to know how late you should stay at work :)

The first thing you do when you get to work is open your web browser and open these sites in new tabs. It's not particularly cumbersome because there are just 4 sites. But think about it: maybe there is a new blog that you start to like and, oh, these cartoons are really funny. Maybe there is also a bit of financial info that you're interested in, and the pictures that your brother is posting to Flickr every couple of days: they just had a new baby! Also, as you're trying to buy a house, you'd love a little raise, and you've figured that your boss really likes it when you tell her that you've read about your company in the news or when you tell her about a new competing product... There is also StackOverflow. You're desperately trying to get this "expert" badge and boost up your reputation: this may help with your boss too, or even when you're looking for a new job.

Opening all these tabs is starting to take a toll and you keep forgetting an important one. You're also slowly getting tired of the different reading experience that all these sites have: small fonts, large fonts, ads all over... etc. Now you have a problem.
Imagine there is a tool that does the following: you can tell it what sites you care about, and then, this tool will look up the new stuff for you. It will show everything in a nice looking format. It should also help you identify what's really worth seeing ASAP or maybe have some kind of "serendipity" mode that you can go into and find interesting stuff that you would have missed otherwise. The tool will obviously send you to the original sites should you need more info about any particular story or classified...
This tool exists. It's usually called a Reader, mostly because it lets you read more things online. Oftentimes you'll see them called "RSS readers", because RSS is what they use to get the information from all these sites. RSS is the pipe. You as a user should probably not know about it, but that's what the readers depend on. In an ideal world, when you're on a site you like, you should just hit "follow" on a button like this one and then you'd be redirected to your reader of choice. Later, when new content is added, you'll get it straight in your reader.
To get a bit into more technical details, RSS (like Atom) is an XML flavor. It's a collection (mostly reverse chronological) of entries. Entries have at least a title and a link to the actual story. They should also include a unique identifier and could have other elements like a description, an image, tags, author information... etc.
RSS is great because it's content agnostic. It can be used to represent a lot of different things (as described in the little story) and decouples the publishing platform from the subscribing platform: they don't even know the other one exists. RSS is their lingua-franca.
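If you do want to touch the pipe directly rather than use a reader, a small sketch with the feedparser library shows what a reader is doing under the hood (the feed URL is just an example).

```python
# Tiny sketch of consuming a feed programmatically; a reader does essentially
# this on a schedule, for every feed you follow.
import feedparser

feed = feedparser.parse("https://example.com/feed/")

print(feed.feed.get("title"))
for entry in feed.entries[:5]:
    # title + link are the minimum; id, summary, published are common extras
    print(entry.get("title"), "->", entry.get("link"))
```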
I wrote a blog post about this very question not long ago. Here's the link if you're interested in reading my personal interpretation. https://www.rss.com/whatisrss
An RSS feed's XML file is all the content of a page, without the presentation markup. The XML represents the data in its rawest, most descriptive form. Many readers can interpret XML sources from a variety of places and format all of the data in their own unique way.

data collection for statistics: from web to a database

I'm a statistician by trade and I'd like some recommendations on how to set up a website that can collect data into a database. For personal use, I use Google Forms to collect data, and everything gets populated into a spreadsheet. However, this may not be appropriate in a more professional setting, especially when we have multiple pages/forms. I imagine two uses:
A website where I can send the link to others so they can fill it out, similar to Google Forms.
A website where only authorized users can log in to fill out data. Think of a setting where patients are followed periodically in a research study. It'd be cool to have the clinician enter the data directly into the database as he/she fills out the forms, as opposed to having another data analyst transcribe the written forms into the database.
The obvious solution would be to hire a web developer. However, I like doing things myself when they are manageable. I imagine a web developer would have to know HTML, PHP, and databases (e.g., MySQL or PostgreSQL). My experience with these is limited to setting up a WordPress blog on my Linux server. My experience with HTML is also limited, as I use Emacs Org-mode to generate it from plain text. I hope to hear about solutions with a minimal learning curve. My preference of course would be free open-source software and Linux-based, but I'd like to hear all available solutions (our data manager is a Windows user).
I recently read a post on Linux Journal that mentions REDCap, but it seems you have to get institutional permission to use it.
I also tagged "R" on this post as I'd like to hear what R users are doing about data collection. I'll ultimately analyze the data with R, but all data analysis begins with the scientific question and data collection.
Thanks!
UPDATE 10/4/2010: Thanks everyone for the responses so far. It appears most of the third-party solutions proposed so far have data housed in a database hosted by the vendor. I'd like to house all data in our SQL Server. That is, data entry from the web enters the database in real time, ready for data analysis.
Maybe the limesurvey.org project is of interest ...
It sounds to me like you've got yourself a med study. There is a plethora of concerns that come to mind just from what you've described you want to do, not the least of which is privacy. Where is it going to be hosted? Have you received consent from the patients to be collecting and transmitting their information electronically? What data are you storing, if any, that could combine to reveal their identity?
Personally, I steer clear of DIY online data collection tools. I pay a firm, like Ipsos, Research Now/E-Rewards, to program and manage data collection using questionnaires that I have designed. The reason is, knowing how to design research and analyze data is one thing. But if you've been trained in statistics - I can safely argue that you "don't know shit" about data collection. Sure you may know a bunch about sampling theory, but when it comes to getting data in - it's best to leave it to the pros.
There are a number of "industrial quality" online data collection tools available.
Confirmit (Pretty much the gold standard for online data collection)
DASH (Smaller following, but incredibly flexible)
There are also purely web based solutions, some of which are free (not that I would recommend using them)
QuestionPro
SurveyMonkey
Zoomerang
Although, unless you're doing a study with over 50 patients, I would just recommend getting the Physicians or their assistants to fill out Excel sheets and send them to your co.
Also, it's unlikely that you'll need to set up a username/password system. What you want is referred to as an "open link", where respondents click a link and enter information; identifier info can be added by the respondent. You don't need a password because people can only INPUT information, not read it.
Most of the systems I mentioned above work on the idea of emailing a respondent (a clinician) with a link to a web based survey. Which could be easily adapted to your specific needs and act as a reminder to the clinician to fill out the form.
If your question types are simple, I'm sure you could hire a programmer to put together a website that has the forms you need behind an authorized front end. PHP/MySQL would likely do the trick. But I would double-check the privacy laws in your jurisdiction surrounding medical research before going ahead.
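To give a feel for what that involves, here is a hedged sketch of the same idea in Python (Flask + SQLite) rather than PHP/MySQL. It is only a skeleton: there is no authentication, validation, or HTTPS here, all of which a real medical study would need, and the field names are made up.

```python
# Minimal sketch of a web form writing straight into a database.
import sqlite3
from flask import Flask, request

app = Flask(__name__)

@app.route("/submit", methods=["POST"])
def submit():
    conn = sqlite3.connect("study.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS visits (patient_id TEXT, weight REAL)"
    )
    conn.execute(
        "INSERT INTO visits (patient_id, weight) VALUES (?, ?)",
        (request.form["patient_id"], float(request.form["weight"])),
    )
    conn.commit()
    conn.close()
    return "Thanks, the record was saved."

if __name__ == "__main__":
    app.run()
```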
I have conducted medical research using an online form (actually two of them). My questions were quite discrete and particular to the disease I was researching.
Previously in a related project, I had created two or three page questionnaires which were printed and then subjects and surgeons filled out the forms and our research coordinator would enter them into our database. It was a lot of work with lots of room for error. I did not like it. Online forms were much better.
I used SurveyGizmo and was happy with it. I looked at lots of options about two years ago; Google Forms did not exist at that time. I went with SurveyGizmo primarily because they had a statement (attestation) that they were compliant with HIPAA. I could not ensure security such as SSL connections with the other websites. However, in order to get that capability (HTTPS connections) I had to buy the enterprise level, even though for every other capability I could have used the free service. Also, SurveyGizmo offered a 50% reduction for non-profits, which our research institute qualified for.
SurveyGizmo was easy to design in and put into production without having to program myself. It was easy to download the data in CSV format and read that straight into R, although I had some weird issues that I needed help with. I had to use the "old" format for export so that it came as a straightforward CSV. Furthermore, the CSV file had the odd feature of the first TWO rows being header rows, but I solved that problem with the help of Stack Overflow.
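For anyone hitting the same double-header export, the fix is simply to skip the second header row when reading the file. The author solved it in R; the analogous sketch in Python/pandas (file name is a placeholder) would be:

```python
# Skip the extra header row (row index 1) that some exports include.
import pandas as pd

df = pd.read_csv("surveygizmo_export.csv", skiprows=[1])
print(df.head())
```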
SurveyGizmo has fantastic logic and piping that enabled me to ask only relevant questions and thereby not waste the time of my respondents; even more importantly, there were no irrelevant questions to confuse respondents.
Finally, I was able to use SurveyGizmo in such a way that I could also track our (research staff) fulfillment and logistics. For instance we got notification when there were new potential subjects who were interested in participating. We were able to note FedEx tracking numbers along with the records of the corresponding subjects.
Basically it worked well.
The safest platform for collecting confidential survey data is Confirmit. There is a learning curve involved here- you will be coding in VisualSQL, which is only used in Confirmit. The survey responses will export to csv files, where you can analyze your results in R.
If you are collecting any confidential data, or data where respondents need unique access links so they can only see their own version of the survey, you will want to use Confirmit. The data is housed in Confirmit's data center, but their data is much more secure than other vendors (i.e., a third party will not be able to hack into your survey and see an individual's responses, or intercept the data that is being sent from your respondent to Confirmit).
