Dictionary structure in Firebase

I want to write a program using the Google Firebase Realtime Database. The idea is similar to a dictionary, where one word is connected to many other words, and searching for any of these words displays all of the connected words.
I need to know how to structure my data so I can implement this idea. To be honest, I have no idea where to start. Any ideas or links to get me going would help.
**UPDATE: FIRST IDEA**
I am thinking of making a root entry for each word; when searching for a word, I can then find all alternative/synonym words connected to that root. Example:
Assume the root word is FIND with ID 1, and there are 3 alternative/synonym words: SEEK, SPOT, SEARCH.
The database will look like:

| ID | word   | rootID |
|----|--------|--------|
| 1  | FIND   | 0 (0 means it is a root word) |
| 2  | SEEK   | 1 (root word is FIND) |
| 3  | SEARCH | 1 |
| 4  | SPOT   | 1 |
Now if I search for SPOT, I know that FIND is its root, so I can find all words that have rootID 1 and are therefore synonyms of the word FIND.
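
One way to map this onto the Realtime Database is to keep every word under a single `words` node with a `rootID` child, and query it with `orderByChild`. Below is a minimal sketch assuming the modular Firebase JS SDK (v9+); the node name `words`, the key layout, and the helper `findSynonyms` are placeholders, not a required structure.

```js
// Example data layout in the Realtime Database (mirrors the table above):
// {
//   "words": {
//     "1": { "word": "FIND",   "rootID": 0 },
//     "2": { "word": "SEEK",   "rootID": 1 },
//     "3": { "word": "SEARCH", "rootID": 1 },
//     "4": { "word": "SPOT",   "rootID": 1 }
//   }
// }
import { initializeApp } from "firebase/app";
import { getDatabase, ref, query, orderByChild, equalTo, get } from "firebase/database";

const app = initializeApp({ databaseURL: "https://<your-project>.firebaseio.com" });
const db = getDatabase(app);

// Given a word the user typed (e.g. "SPOT"), return the root word and all its synonyms.
async function findSynonyms(searchWord) {
  const wordsRef = ref(db, "words");

  // 1) Find the entry whose "word" child equals the search term.
  const hit = await get(query(wordsRef, orderByChild("word"), equalTo(searchWord)));
  if (!hit.exists()) return [];

  // 2) Resolve the root ID: the entry's own key if it is a root, otherwise its rootID.
  let rootId = 0;
  hit.forEach((child) => {
    const val = child.val();
    rootId = val.rootID === 0 ? Number(child.key) : val.rootID;
    return true; // stop after the first match
  });

  // 3) Collect the root word itself plus every word whose rootID points at it.
  const results = [];
  const rootSnap = await get(ref(db, `words/${rootId}`));
  if (rootSnap.exists()) results.push(rootSnap.val().word);

  const siblings = await get(query(wordsRef, orderByChild("rootID"), equalTo(rootId)));
  siblings.forEach((child) => { results.push(child.val().word); });
  return results;
}
```

For the `orderByChild` queries to stay fast as the word list grows, the database rules would typically declare `".indexOn": ["word", "rootID"]` on the `words` node.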

Related

How to Use If else in Neo4j Cypher or using RNeo4j?

My Neo4j database has 5 different types of nodes and 120k nodes in total.
There are very few cases where all 5 types of node are connected through relationships.
For example, (A)-->(B)-->(C)-->(D)-->(E).
In that case I want to return this path of length 4 depending on the ID searched for; otherwise return any path of length 4 that exists, else any path of length 3, and so on.
Currently I am sending a 5-node path query, then a 4-node path query, and so on from an R program to Neo4j, which is expensive.
Is there an easy way to do this in a single Cypher query?
There are many questions discussing something similar that suggest using CASE, FOREACH, or APOC, but nothing seems to work for me.
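
For reference, Cypher's variable-length pattern combined with `ORDER BY length(p)` is the usual building block for a "longest available path first" query. The sketch below only illustrates that pattern, assuming the start node is found by an `id` property; it is not a full answer to the conditional-fallback requirement.

```cypher
// Match paths of length 1 to 4 starting from the searched node,
// then keep the longest one that actually exists.
MATCH p = (start {id: $searchId})-[*1..4]->(end)
RETURN p
ORDER BY length(p) DESC
LIMIT 1
```

On 120k nodes an unconstrained variable-length match can be expensive, so in practice the relationship types and node labels would be restricted in the pattern.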

How to add customized tokens into Solr to change the indexing token behaviour

It's a Drupal site with Solr for search. Mainly, I am not satisfied with the current search results for Chinese. The tokenizer has broken the words into supposedly small pieces. Most of them are reasonable, but it still makes mistakes by not treating something as a valid token, either breaking it into pieces or not breaking it when it should.
Assume I am writing Chinese now: "big data analysis" is one word which shouldn't be broken, so my search on it should find it. I also want people to find "AI and big data analysis training" as the first hit when they search for the exact phrase "AI and big data analysis training".
So I want a way to intervene in, or compensate for, the current tokens to make the search smarter.
Maybe there is a file in Solr that allows me to manually write these tokens down and relate them to certain phrases? Then every time it indexes, Solr could use it as a reference.
There are different steps to achieve what you want:
1) I don't see an extremely big problem with your "over-tokenization":
"big data analysis is one word which shouldn't be broken. So my search on it should find it." -> your search will find it even if it is tokenized. I understand this was an example and the actual words are Chinese, but I suspect a different issue there.
2) You can use the edismax [1] query parser with phrase boosts at various levels to boost subsequent tokens or phrases (pf, pf2, pf3 ... ps, ps2, ps3 ...).
[1] https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html , https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html#TheExtendedDisMaxQueryParser-ThepsParameter
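
As an illustration of point 2), the phrase-boost parameters can be passed on the request itself (or set as defaults on the request handler in solrconfig.xml). The core name, the field name `content`, and the boost values below are placeholders, not recommendations.

```text
# Hypothetical edismax request; "mycore" and "content" are assumed names.
http://localhost:8983/solr/mycore/select
    ?defType=edismax
    &q=AI and big data analysis training
    &qf=content
    &pf=content^50     # boost docs where the whole query occurs as a phrase
    &pf2=content^20    # boost consecutive word pairs
    &pf3=content^10    # boost consecutive word triples
    &ps=2              # phrase slop applied to pf
```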

Trying to create a list of locations without any duplicates

So I've got a table that is used largely for inventory purposes. There's a location, part number, length (a single part can have multiple lengths), user, etc.
How the system is supposed to work is that one person scans the parts and lengths; once that's done, a second and third person come and scan the parts in succession.
I'm trying to create a list of locations in which no part/length combination has multiple scans. So if any part/length combination has been scanned more than once, that entire location is thrown out and is not in the final list.
I've been racking my brain; this seems like a simple query, but I can't seem to find something that works.
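
A sketch of one way to express this, assuming a table named `Scans` with `Location`, `PartNumber`, `Length` and `ScannedBy` columns; the real names will differ, and if "multiple scans" should be counted per person, `ScannedBy` would be added to the `GROUP BY`.

```sql
-- Keep only locations where no part/length combination appears more than once.
SELECT DISTINCT s.Location
FROM Scans AS s
WHERE s.Location NOT IN (
    SELECT Location
    FROM Scans
    GROUP BY Location, PartNumber, Length
    HAVING COUNT(*) > 1   -- any duplicate part/length disqualifies the whole location
);
```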

Two-step search to find documents with similar vectors in Solr

I am thinking of finding documents in Solr that have similar vectors:
1) A user enters a few keywords.
2) A list of documents that have the keywords is returned by Solr, based on Solr's scoring algorithms.
3) The user then selects a couple of documents as the reference documents.
4) Solr then searches for documents that correlate closely (have similar vectors) with the selected documents.
For the first 3 steps I know what to do, but I have no clue how to perform step 4. I have read https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component, but am still not sure how to perform step 4.
I can think of two approaches. The first is to use search-result clustering: first search by the keywords, then ask Solr to cluster the results and present the user with the list of clusters and their documents.
The second approach is to send multiple requests to the More Like This handler and merge the results; in each request, you use one of the reference documents that the user has marked.
Step 4 sounds like a More Like This function, which already ships with Solr.
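
As an illustration of the More Like This suggestion, one request per reference document might look like this; the core name `mycore`, the uniqueKey field `id`, and the field `text` are placeholders.

```text
# Hypothetical MoreLikeThis request for a single reference document.
http://localhost:8983/solr/mycore/select
    ?q=id:REFERENCE_DOC_ID
    &mlt=true          # enable the MoreLikeThis component
    &mlt.fl=text       # field(s) whose terms are used for similarity
    &mlt.mintf=1       # minimum term frequency in the source document
    &mlt.mindf=2       # minimum document frequency across the index
    &mlt.count=10      # number of similar documents per result
```

Running this once for each document the user marked and merging the result lists corresponds to the second approach described above.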

Storing variable data

I am building an application in ASP.NET, C#, MVC3 and SQL Server 2008.
A form is presented to a user to fill out (name, email, address, etc).
I would like to allow the admin of the application to add extra, dynamic questions to this form.
The number of extra questions and the type of data returned will vary.
For instance, the admin could add 0, 1 or more of the following types of questions:
Have you a full, clean driving licence?
Rate your driving skills from 1 to 5.
Describe the last time you went on a long journey.
etc ...
Note that the answers provided could be binary (Q.1), integer (Q.2), or free text (Q.3).
What is the best way of storing random data like this in MS SQL?
Any help would be greatly appreciated.
Thanks in advance.
I would create a table with the following columns and store the name of the variable along with the value in the appropriate column, leaving all other value columns null.
id: int (primary key)
name: varchar(100)
value_bool: bit (nullable)
value_int: int (nullable)
value_text: varchar(100) (nullable)
Unless space is an issue, I would use VARCHAR(MAX). It gives you up to 8,000 characters and stores numbers and text.
Edit: Actually, as Aaron points out below, that will give you 2 billion characters (enough for a book). You might go with VARCHAR(8000) or the like instead, which does give you up to 8,000 characters. Since it is VARCHAR, it will not take up empty space (so a 0 or 1 will not take up 8,000 characters' worth of space, only 1).
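
A minimal DDL sketch of the table described above; the table name `ExtraAnswers` and the identity primary key are assumptions, and a real schema would also need a column linking each answer back to the form submission and question definition.

```sql
-- Typed-column approach from the answer above; exactly one value_* column per row is non-null.
CREATE TABLE ExtraAnswers (
    id         INT IDENTITY(1,1) PRIMARY KEY,
    name       VARCHAR(100) NOT NULL,   -- the question/variable name
    value_bool BIT NULL,                -- yes/no answers (e.g. clean driving licence)
    value_int  INT NULL,                -- numeric answers (e.g. a 1-5 rating)
    value_text VARCHAR(100) NULL        -- free-text answers
);

-- Example rows, one populated value column each:
INSERT INTO ExtraAnswers (name, value_bool) VALUES ('clean_licence', 1);
INSERT INTO ExtraAnswers (name, value_int)  VALUES ('driving_rating', 4);
INSERT INTO ExtraAnswers (name, value_text) VALUES ('last_long_journey', 'Road trip last summer');
```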
