How to give synonyms to an employee name lookup table? - lookup-tables

I have created a lookup table of employee names (a text file), following the Rasa blog post linked below.
Improving entity extractions with Rasa
Now my use case also requires me to give synonyms to these employees in the lookup table. For example, “Nicholas” can also be referred to as “Nick” or “Nic”, so that the rasa bot can extract “nick” as “nicholas” and fulfill the use case.
Please advise how to achieve this.
Thanks

Lookup tables and synonyms serve different purposes: lookups are used for entity extraction, while synonyms are a post-processing step that maps an extracted synonym back to its original (canonical) value. Therefore, I think you can't have synonyms within the lookup table, so you will have to define them separately.
However, if you have a long list of synonyms, you can use a file path instead of a list.
## synonym:Nick
data/path/nick.txt
I had a similar situation with city names and their nicknames: I was using the city names from the lookup table but placed their synonyms in the main data file, as follows:
## synonym:New York City
- NY
- NYC
- New York
## lookup:city
data/lookups/city_lookup.txt
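As a rough illustration of how the two pieces divide the work, here is a plain-Python sketch. It is not Rasa's actual implementation, and `LOOKUP`, `SYNONYMS` and `extract_employee` are made-up names. It assumes the nicknames are also listed in the lookup so that they get extracted at all; the synonym map then normalizes whatever was extracted.

```python
import re

# Hypothetical lookup entries: every surface form we want to recognize.
LOOKUP = ["nicholas", "nick", "nic", "maria"]

# Hypothetical synonym map: surface form -> canonical employee name.
SYNONYMS = {"nick": "nicholas", "nic": "nicholas"}

# Longest entries first so "nicholas" is preferred over "nic".
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(n) for n in sorted(LOOKUP, key=len, reverse=True)) + r")\b",
    flags=re.IGNORECASE,
)


def extract_employee(text):
    """Step 1: extract a name via the lookup. Step 2: map it to its canonical form."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    value = match.group(1).lower()
    return SYNONYMS.get(value, value)
```

With this split, "Is Nick in today?" and "Is Nicholas in today?" both resolve to the same value, while a name without synonyms passes through unchanged.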
I recommend using https://github.com/rodrigopivi/Chatito which will really ease the task for you as it has a really good mapping system that does the work for you with regards to synonyms and lookups.

Related

Adding a new user to neo4j

A total Neo4j noob here.
I would like to create a graph to store a set of users. A typical user looks as follows:
CREATE
(node_1 {FullName:"Peter Parker",FirstName:"peter",FamilyName:"parker"}),
(node_2 {Address:"Newyork",CountryCode:"US"}),
(node_3 {Location:"Hidden"}),
(node_4 {phoneNumber:11111}),
(node_5 {InternetEmailAddress:"peter#peterland.com"})
Now the problem is:
Every time I execute this, I add 5 more nodes.
I know I need to use a unique key, but all the examples I saw use a unique key for a specific node. So how can I make sure a user doesn't get added if it already exists? (I can use the email address as the unique key.)
How do I update the nodes if some changes occur? For example, after a week I want to update the graph to contain the following instead of the previous one (no duplicates):
CREATE(node_1 {FullName:"Peter Parker",FirstName:"peter",FamilyName:"parker"}),(node_2 {Address:"Newyork",CountryCode:"US"}),(node_3 {Location:"public"}),(node_4 {phoneNumber:11111}),(node_5 {InternetEmailAddress:"peter#peterland.com"}),(node_6 {status:"Jailed"})
(NOTE: the new update changed the location to "public" and added a new node for Peter.)
Seeing as you had a load of nodes anyway.
Some of the data you have modelled as nodes are probably properties, as the other answer suggests; some are possibly correctly modelled as nodes; and one could probably form all or part of a relationship.
Location public/hidden can be modelled in one of three ways: as a property on the Person, as a property on the relationship between the Person and the Location, or as the relationship type. To understand that, you first need to have a relationship.
Your address at the moment is another Node, I think this is correct, but possibly you would want two nodes, related something like this:
(s:State)-[:IN_COUNTRY]->(c:Country)
YMMV, and clearly that's a US-centric model, but you can extend it easily enough.
Now you could create Peter with a LIVES_IN relationship:
CREATE (p:Person{fullName:"Peter Parker"}), (s:State{name:"New York"}), (c:Country{code:"US"}),
(p)-[:LIVES_IN]->(s), (s)-[:IN_COUNTRY]->(c)
For speed, you are better off modelling two relationship types, which could be LIVES_IN_PUBLIC and LIVES_IN_HIDDEN. That means that to perform the update you want above, you have to delete the one and create the other. However, if speed is not of the essence, it is also common to use properties on the relationship.
CREATE (p:Person{fullName:"Peter Parker"}), (s:State{name:"New York"}), (c:Country{code:"US"}),
(p)-[:LIVES_IN{public:false}]->(s), (s)-[:IN_COUNTRY]->(c)
So your complete Q&A:
// Initial creation
CREATE (p:Person {fullName:"Peter Parker", firstName:"peter", familyName:"parker", phoneNumber:11111, internetEmailAddress:"peter#peterland.com"}),
(s:State {name:"New York"}), (c:Country {code:"US"}),
(p)-[:LIVES_IN {public:false}]->(s), (s)-[:IN_COUNTRY]->(c);
// Update a week later (a separate statement)
MATCH (p:Person {internetEmailAddress:"peter#peterland.com"})-[li:LIVES_IN]->()
SET li.public = true, p.status = "jailed";
When adding other People you probably do not want to recreate States and Countries, rather you want to match them, and possibly Merge them, but we'll stick to Create.
MATCH (s:State{name:"New York"})
CREATE (p:Person{fullName:"John Smith", internetEmailAddress:"john#google.com"})-[:LIVES_IN{public:false}]->(s)
John Smith now implicitly lives in the US too as you can follow the relationship through the State Node.
Treatise complete.
I think you're modeling your data incorrectly here - you're setting up each property of the person as a separate node, which is not a good idea. You don't have any linkages between those nodes, so with this data pattern, later on you won't be able to tell what Peter Parker's address is. You're also not using node labels, which I think could really help here.
The quick answer to your question about updating nodes is that you have to MATCH them, then use SET to modify a property. So if you had a person, you might do this:
MATCH (p:Person { FullName: "Peter Parker" })
SET p.Address = "123 Fake Street"
RETURN p;
But notice I'm making assumptions about the way your data is structured. Taking the same data you provided, this might be a better way of creating it:
CREATE (node_1:Person {FullName:"Peter Parker",
FirstName:"peter",
FamilyName:"parker",
Address:"Newyork",CountryCode:"US",
Location:"Hidden",
phoneNumber:11111,
InternetEmailAddress:"peter#peterland.com"});
The difference with this suggestion is that I'm putting all the properties into a single node (instead of one property per node) and I'm applying the Person label to the node.
If you structured the data like this, the update query I provided would work. With the data structured the way you have it, it's not possible to update Peter Parker's address, because there's no relationship between your node_1 and node_2.
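On the duplicate question itself: one option, assuming the email address is the unique key as the question suggests, is a MERGE keyed on that property followed by a SET, so rerunning the statement updates the existing person instead of adding a new one. Below is a minimal Python sketch that only builds the Cypher text and its parameters. The helper name `build_person_upsert` is hypothetical, and actually running the query needs a driver session, which is not shown here.

```python
def build_person_upsert(email, props):
    """Return (query, params) for an idempotent create-or-update of a Person.

    MERGE matches an existing Person with this email or creates one;
    SET p += $props then adds or overwrites the given properties.
    """
    query = (
        "MERGE (p:Person {internetEmailAddress: $email}) "
        "SET p += $props "
        "RETURN p"
    )
    params = {"email": email, "props": dict(props)}
    return query, params


# Example: the "week later" update from the question, expressed as an upsert.
query, params = build_person_upsert(
    "peter#peterland.com",
    {"fullName": "Peter Parker", "status": "jailed"},
)
```

A uniqueness constraint on the email property would additionally guard against duplicates created by other code paths.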

Enter data in mother table using data from child tables

Hi all,
I have 3 tables in an Access 2010 database:
Crew: CrewID; Name; Adres;...
Voyage: VoyageId; Voyage name; Departure harbour; Arrival harbour
Crewlist: CrewlistId, VoyageId, CrewId, Rank
The VoyageId and CrewId from the Crewlist table are linked (relation) to the autonumber IDs from tables 2 and 1.
My first and main question is: upon boarding, everyone has to ‘sign in’ by selecting the voyage and their name, and be assigned a role (or this is to be done by the responsible officer). How can I make a form that lets the users browse through the voyage names and crew names instead of the IDs used in the ‘mother’ table (table 3: Crewlist)?
2nd question: how can I make sure that someone isn’t enrolled twice for the same voyage (adding the same VoyageId and the same CrewId in Crewlist)? This would preferably be blocked upon trying to add the same person a second time to a voyage.
To prevent duplicates in Crewlist, add a unique index to the table on both CrewId and VoyageId.
It would be a good idea to add relationships and enforce referential integrity.
You are now in a position to use the wizards to create a form based on Voyage and a subform based on CrewList with a combobox based on Crew.
There are a number of refinements you could add.
Make sure you do not use reserved words like Name and do not put spaces in field names. You will thank yourself later.
See also create form to add records in multiple tables
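To show the effect of the composite unique index, here is a small sketch using Python's built-in sqlite3 as a stand-in for Access (an assumption made purely so the example is runnable; the index idea carries over, the syntax in Access may differ). The field names `CrewName`, `VoyageName` and `CrewRank` are illustrative and follow the advice above about avoiding reserved words and spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Crew (CrewId INTEGER PRIMARY KEY, CrewName TEXT);
CREATE TABLE Voyage (VoyageId INTEGER PRIMARY KEY, VoyageName TEXT);
CREATE TABLE Crewlist (
    CrewlistId INTEGER PRIMARY KEY,
    VoyageId INTEGER NOT NULL REFERENCES Voyage(VoyageId),
    CrewId INTEGER NOT NULL REFERENCES Crew(CrewId),
    CrewRank TEXT
);
-- One row per (voyage, crew member): a second enrolment is rejected.
CREATE UNIQUE INDEX idx_crewlist_voyage_crew ON Crewlist (VoyageId, CrewId);
""")
conn.execute("INSERT INTO Crew (CrewId, CrewName) VALUES (1, 'Anna')")
conn.execute("INSERT INTO Voyage (VoyageId, VoyageName) VALUES (1, 'Spring crossing')")


def enroll(voyage_id, crew_id, crew_rank):
    """Try to enrol a crew member; return False if already on that voyage."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Crewlist (VoyageId, CrewId, CrewRank) VALUES (?, ?, ?)",
                (voyage_id, crew_id, crew_rank),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def crew_on_voyage(voyage_id):
    """Show names instead of IDs by joining Crewlist to Voyage and Crew."""
    cur = conn.execute(
        "SELECT v.VoyageName, c.CrewName, l.CrewRank "
        "FROM Crewlist AS l "
        "JOIN Voyage AS v ON v.VoyageId = l.VoyageId "
        "JOIN Crew AS c ON c.CrewId = l.CrewId "
        "WHERE l.VoyageId = ?",
        (voyage_id,),
    )
    return cur.fetchall()
```

The join in `crew_on_voyage` is the same idea the combobox relies on: the table stores the IDs, and the names are looked up for display.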

Get peoples surnames or christian names from freebase?

Is it possible to get a user's first name or surname from a Freebase query?
For example, I have the id of a person entry, but I just want to extract their first name.
{
"id": "/en/paul_thomas_anderson",
"name" : null
}
How would I modify this query? It's something I've found nothing about by googling or searching here on S.O. I know this kind of thing is possible in DBpedia for most people entries.
No, it's not possible directly. The name is stored as a single unit. There are topics for given names and surnames (e.g. http://www.freebase.com/view/base/givennames/given_name), so you could split the name and see which list(s) it appears in, but that's indirect and doesn't tell you about the specific person you are querying.
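The indirect approach could look roughly like the sketch below. The two sets are tiny hand-made stand-ins for the given-name and surname topic lists (an assumption; no real query is made here), and `classify_name_parts` is a hypothetical helper.

```python
# Hypothetical stand-ins for the given-name and surname topic lists.
GIVEN_NAMES = {"paul", "thomas"}
SURNAMES = {"anderson", "thomas"}


def classify_name_parts(full_name):
    """Split a full name and report, per part, which list(s) it appears in.

    Returns a list of (part, is_given_name, is_surname) tuples.
    """
    result = []
    for part in full_name.split():
        key = part.lower()
        result.append((part, key in GIVEN_NAMES, key in SURNAMES))
    return result
```

Note that a part such as "Thomas" can appear in both lists, which is exactly why this tells you nothing certain about the specific person.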

Include synonyms in solr without using synonyms.txt

I am using Drupal Apache Solr for my searches. It comes with a synonyms.txt file in which you can manually list synonyms for the words you want.
But I suppose it would be very hard to enter synonyms manually for each word, as my application has a large amount of data.
What I want to achieve in my search results is the following:
When the user searches for allu instead of potato, we display potato as the first result.
Another example: when the user searches for 'raw apple', we display 'apple' as the first record, because 'raw apple' is a synonym of 'apple'.
But the problem is that there are 100K records, and each record has 4-5 synonyms. Entering them manually is not possible.
Another issue is that if I want to change the synonyms of a particular record, I have to do it manually, which is time consuming as well.
Is there any other option, so that I do not need to enter synonyms manually?
IMO this is close to search engine optimization. Also you may have a tough time managing the synonyms manually.
Follow what Indian e-retail sites are doing to accommodate synonyms. For example, e-retail stores have adapted by listing a certain product as "belly shoes", because shoppers tend to mispronounce and misspell "ballet". They wouldn't have anticipated this before users actually searched for it.
So log all requests which return few results (and otherwise dissatisfy customers). Maintain a list of synonyms in the index. And include these synonyms in the keywords when adding a new product: when adding a product x y z, automatically fetch all synonyms to x, y and z and let your data entry guys choose from them.
'type':'synonym'
'terms':'ballet','belly'
'type':'synonym'
'terms':'potato','allu','aloo'
'type':'product'
'name':'home garden potato planter'
'keywords':'allu','aloo'
'type':'product'
'name':'aloo mutter fry mix'
'keywords':'potato','allu','cheese'
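The "fetch synonyms when adding a product" step might be sketched like this in Python. The in-memory `SYNONYM_GROUPS` list stands in for the synonym documents kept in the index, and `suggest_keywords` is a hypothetical helper; a real setup would read the groups from the index instead.

```python
# Stand-in for the 'type':'synonym' documents stored in the index.
SYNONYM_GROUPS = [
    {"ballet", "belly"},
    {"potato", "allu", "aloo"},
]


def suggest_keywords(product_name):
    """Suggest extra keywords for a new product.

    For every word in the name that belongs to a synonym group, offer the
    other members of that group, so data entry can pick from them.
    """
    words = set(product_name.lower().split())
    suggestions = set()
    for group in SYNONYM_GROUPS:
        if words & group:
            suggestions |= group - words
    return sorted(suggestions)
```

For "home garden potato planter" this suggests "allu" and "aloo", matching the keywords in the example product above. New groups would come from the logged queries that returned few results.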
We can maintain a list of synonyms in the index and include these synonyms in the keywords when adding a new product: when adding a new product a b c, it can fetch the synonyms of a, b and c.
'type':'product'
'name' :'monety carlo shirt for men'
'keywords' : 'montey carlo', 'shirts'
Example: online shopping stores have adapted by renaming certain products to match the names that shoppers misspell.

Allow users to create new categories and fields on ASP.NET website

We have a DB-driven ASP.NET / SQL Server website and would like to investigate how we can allow users to create a new database category and fields. Is this crazy? Are there any examples of such organic websites out there? The fact that I haven't seen any maybe suggests I am.
Interested in the best approach which would allow some level of control by Admin.
I've implemented things along these lines with a dictionary table, rather than a more traditional table.
The dictionary table might look something like this:
create table tblDictionary
(id uniqueidentifier, --Surrogate Key (PK)
itemid uniqueidentifier, --Think PK in a traditional database
colmn uniqueidentifier, --Think "column name" in a traditional database
value nvarchar(max), --Can hold either string or number (nvarchar without a length is only 1 character)
sortby integer) --Sorting columns may or may not be needed.
So, then, what would have been one row in a traditional table would become multiple rows:
Traditional Way (of course I'm not making up GUIDs):
ID  Type  Make  Model    Year  Color
1   Car   Ford  Festiva  2010  Lime
...would become multiple rows in the dictionary:
ID  ITEMID  COLUMN    VALUE
0   1       Type      Car
1   1       CarMake   Ford
2   1       CarModel  Festiva
3   1       CarYear   2010
4   1       CarColor  Lime
Your GUI can search for all records where itemid=1 and get all of the columns it needs.
Or it can search for all records where itemid in (select itemid from tblDictionary where colmn='Type' and value='Car') to get all columns for all cars.
In theory, you can put the user-defined types into the same table (Type='Type') as well as the user-defined columns that that Type has (Type='Column', Column='ColumnName'). This is where the sortby column comes into it: it helps build the GUI in the correct order, if you don't want to rely on something else.
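Here is a small runnable sketch of the same lookup, using Python's built-in sqlite3 rather than SQL Server and plain integers and strings instead of GUIDs (both are simplifications for the example). The second item and the helper `items_of_type` are made up for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblDictionary ("
    "id INTEGER PRIMARY KEY, itemid INTEGER, colmn TEXT, value TEXT, sortby INTEGER)"
)
rows = [
    (1, "Type", "Car", 0),
    (1, "CarMake", "Ford", 1),
    (1, "CarModel", "Festiva", 2),
    (1, "CarYear", "2010", 3),
    (1, "CarColor", "Lime", 4),
    (2, "Type", "Boat", 0),       # hypothetical second item of another type
    (2, "BoatLength", "30", 1),
]
conn.executemany(
    "INSERT INTO tblDictionary (itemid, colmn, value, sortby) VALUES (?, ?, ?, ?)",
    rows,
)


def items_of_type(type_name):
    """Rebuild one dict per item for all items whose Type equals type_name."""
    cur = conn.execute(
        "SELECT itemid, colmn, value FROM tblDictionary "
        "WHERE itemid IN (SELECT itemid FROM tblDictionary "
        "                 WHERE colmn = 'Type' AND value = ?) "
        "ORDER BY itemid, sortby",
        (type_name,),
    )
    items = defaultdict(dict)
    for itemid, colmn, value in cur:
        items[itemid][colmn] = value
    return dict(items)
```

Every value comes back as text, so the application has to convert numbers such as the year itself; that is part of the reporting cost of this layout.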
A number of times, though, I have felt that storing the user-defined dictionary elements in the dictionary was a bit too much drinking-the-kool-aid. Those can be separate tables because you already know what structure they need at design time. :)
This method will never have the speed or quality of reporting that a traditional table would have. Those generally require the developer to have pre-knowledge of the structures. But if the requirement is flexibility, this can do the job.
Often enough, what starts out as a user-defined area of my sites has had a later project to normalize the data for reporting, etc. But this allows users to get started in a limited way and work out their requirements before engaging the developers.
After all that, I just want to mention a few more options which may or may not work for you:
If you have SharePoint, users already have the ability to create their own lists in this way.
Excel documents in a shared folder that are saved in such a way as to allow multiple simultaneous edits would also serve the purpose.
Excel documents, stored on the webserver and accessed via ODBC, would also serve as single-table databases like this.
