We are using Weaviate to serve e-commerce results.
Our Weaviate database stores all the products we sell.
Based on the customer and the search term we create a vector and use it to query the database; this vector is called search_engine_query_vector.
For example, if a customer has a habit of buying expensive products and searches for "TV", the system will most likely create a vector that is "closer" to the more expensive TVs in the database, so their first page of results shows the most expensive TVs.
While this works well 99% of the time, we also want people to be able to sort by price.
For this we would send a query to Weaviate that only returns products close to our vector (assume these are all the TVs), like below:
client.query.get(
    "Product",
    ["sku", "responseBody", "_additional { certainty }",
     "stores { ...on Store { storeId salesPrice additionalResponseBody } }"]
).with_near_vector(
    {"vector": search_engine_query_vector, "similarity": TV_CUTOFF}
).limit(10)
 .sort_base_on_price()
My question: is there functionality in the API analogous to sort_base_on_price?
You can assume price is a number field in the schema.
Great to hear you're working with Weaviate for an e-commerce solution.
Weaviate added an initial version of sorting functionality in version 1.13!
Alongside Weaviate 1.13.0 we also released python-client v3.5.0, which introduces this functionality as well. You can find the needed method documentation for Python here, or for other clients here!
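With the Python client, that would look roughly like the sketch below. This is only an illustration under a few assumptions: the price property is assumed to be called price in your schema, search_engine_query_vector and TV_CUTOFF are your own placeholders from the question, and nearVector takes a certainty threshold rather than a similarity key:

# Sketch only: assumes weaviate-client >= 3.5.0 and a numeric "price" property.
import weaviate

client = weaviate.Client("http://localhost:8080")  # your Weaviate instance

# search_engine_query_vector and TV_CUTOFF are assumed to be defined by your application.
result = (
    client.query.get(
        "Product",
        ["sku", "responseBody",
         "stores { ...on Store { storeId salesPrice additionalResponseBody } }",
         "_additional { certainty }"],
    )
    .with_near_vector({"vector": search_engine_query_vector, "certainty": TV_CUTOFF})
    .with_sort({"path": ["price"], "order": "desc"})  # analogous to sort_base_on_price
    .with_limit(10)
    .do()
)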
For your use case, you could also try the following GraphQL query in the Weaviate Console:
{
  Get {
    Product(
      nearVector: {
        vector: [search_engine_query_vector]
        certainty: TV_CUTOFF
      }
      sort: {
        order: desc
        path: ["price_field"]
      }
      limit: 10
    ) {
      sku
      responseBody
      stores {
        ... on Store {
          storeId
          salesPrice
          additionalResponseBody
        }
      }
      _additional {
        certainty
      }
    }
  }
}
Related
I want to check if string s is contained in any wikidata item's label, altLabel or description and if so, return all of them. The sheer number of Wikidata items prohibits the use of SPARQL, because it will reach a timeout, so I need to do it locally. I did the same for properties before by performing this query and parsing the result locally:
SELECT ?property ?propertyLabel ?propertyDescription
       (GROUP_CONCAT(DISTINCT(?altLabel); separator = ", ") AS ?altLabel_list)
WHERE {
  ?property a wikibase:Property .
  OPTIONAL { ?property skos:altLabel ?altLabel . FILTER (lang(?altLabel) = "en") }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?property ?propertyLabel ?propertyDescription
It produces a table that looks similar to this "official" one on Wikidata.
What is a space- (and ideally time-) efficient way of obtaining a list/table of all Wikidata items with labels, descriptions and altLabels, just like the one above? Specifically, can I somehow avoid downloading the whole Wikidata dump, parsing it and building the list myself on standard hardware?
I found this tool, but am not sure whether it is capable of doing what I need. I do not want to waste community resources either.
The wdumps tool works and would seem to be the closest to what you're asking for, i.e. a complete list. If you look at the list of recent runs of the tool, you may find what you need already, because it's a common ask.
Aside from working with the whole list locally, the documentation recommends the SPARQL interface to Wikidata's "regular" search engine, like this:
SELECT ?item ?label
WHERE
{
  SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org";
                    wikibase:api "Generator";
                    mwapi:generator "search";
                    mwapi:gsrsearch "inlabel:Frankfurt";
                    mwapi:gsrlimit "max".
    ?item wikibase:apiOutputItem mwapi:title.
  }
  ?item rdfs:label ?label.
  FILTER CONTAINS(?label, "Frankfurt")
}
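If you want to run a query like this from code rather than in the Query Service UI, a minimal sketch against the public https://query.wikidata.org/sparql endpoint could look like the following (TypeScript with fetch; the searchLabels function name is illustrative, the endpoint and JSON result shape are the documented ones):

// Sketch: run the mwapi query above against the public WDQS endpoint.
const WDQS_ENDPOINT = "https://query.wikidata.org/sparql";

async function searchLabels(term: string): Promise<{ item: string; label: string }[]> {
  const sparql = `
    SELECT ?item ?label WHERE {
      SERVICE wikibase:mwapi {
        bd:serviceParam wikibase:endpoint "www.wikidata.org";
                        wikibase:api "Generator";
                        mwapi:generator "search";
                        mwapi:gsrsearch "inlabel:${term}";
                        mwapi:gsrlimit "max".
        ?item wikibase:apiOutputItem mwapi:title.
      }
      ?item rdfs:label ?label.
      FILTER CONTAINS(?label, "${term}")
    }`;
  const res = await fetch(`${WDQS_ENDPOINT}?query=${encodeURIComponent(sparql)}`, {
    headers: { Accept: "application/sparql-results+json" },
  });
  const json = await res.json();
  // Standard SPARQL JSON results: results.bindings[n].<variable>.value
  return json.results.bindings.map((b: any) => ({ item: b.item.value, label: b.label.value }));
}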
And, as a third possibility, I want to mention the interface at https://query.wikidata.org/bigdata/ldf. This is a little-known API for the data. Judging by its speed and its documentation, it is very efficient. But, as the linked example query shows, there are half a billion labels, so even a fast access method like this one will be a challenge.
My application uses keywords extensively; everything is tagged with keywords, so whenever a user wants to search or add data I have to show keywords in an autocomplete box.
As of now I am storing keywords in a separate collection, as below:
export interface IKeyword {
Id:string;
Name:string;
CreatedBy:IUserMin;
CreatedOn:firestore.Timestamp;
}
export interface IUserMin {
UserId:string;
DisplayName:string;
}
export interface IKeywordMin {
Id:string;
Name:string;
}
My main document holds an array of keywords:
export interface MainDocument {
  Field1: string;
  Field2: string;
  // ... other fields ...
  Keywords: IKeywordMin[];
}
But the problem is that autocomplete reads data frequently, so my document-read quota increases very fast.
Is there a way to implement this without increasing reads for keywords? The keywords are not the real data we need to fetch.
Below is my query to get the main documents:
query = query.where("Keywords", "array-contains-any", keywords)
I use the query below to get keywords for the autocomplete text box:
query = query.orderBy("Name").startAt(searchTerm).endAt(searchTerm+ '\uf8ff').limit(20)
This query runs many times while the user types in the autocomplete search, which causes more document reads.
Does this answer your question?
https://fireship.io/lessons/typeahead-autocomplete-with-firestore/
Though the recommended solution is to use a third-party tool:
https://firebase.google.com/docs/firestore/solutions/search
To reduce document reads:
A solution that comes to mind (though I'm not sure it suits your use case) is Firestore's caching feature. By default the Firestore client always tries to reach the server to get the latest changes to your documents, and it only falls back to the cached data on the device when it cannot reach the server. You can take advantage of this feature by reading from the cache first and only going to the server when you want fresh data. For web applications this feature is disabled by default; you can enable it as described in
https://firebase.google.com/docs/firestore/manage-data/enable-offline
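As a rough sketch of that cache-first idea (not the poster's code): the snippet below enables persistence and tries the local cache before the server for the autocomplete query. It uses the Firebase JS SDK v9 modular API; the keywords collection name is an assumption.

// Sketch only: v9 modular SDK, assumed "keywords" collection and "Name" field.
import { initializeApp } from "firebase/app";
import {
  getFirestore, enableIndexedDbPersistence,
  collection, query, orderBy, startAt, endAt, limit,
  getDocs, getDocsFromCache,
} from "firebase/firestore";

const app = initializeApp({ /* your config */ });
const db = getFirestore(app);

// Enable the local cache once at startup (disabled by default on the web).
enableIndexedDbPersistence(db).catch((err) => console.warn("persistence unavailable", err));

async function keywordSuggestions(searchTerm: string) {
  const q = query(
    collection(db, "keywords"),
    orderBy("Name"),
    startAt(searchTerm),
    endAt(searchTerm + "\uf8ff"),
    limit(20)
  );
  // Cache reads don't hit the server, so they don't add to the billed document reads.
  const cached = await getDocsFromCache(q).catch(() => null);
  if (cached && !cached.empty) return cached.docs.map((d) => d.data());
  // Fall back to the server when the cache has nothing useful (this read is billed).
  const fresh = await getDocs(q);
  return fresh.docs.map((d) => d.data());
}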
I found a solution; thought I would share it here.
Create a new collection named typeaheads in the format below:
export interface ITypeAHead {
Prefix:string;
CollectionName:string;
FieldName:string;
MatchingValues:ILookupItem[]
}
export interface ILookupItem {
Key:string;
Value:string;
}
Depending on the minimum number of letters, put either 2 or 3 letters into Prefix, and search based on the prefix, collection and field. So most probably you will end up with 2 or 3 document reads per search.
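For example, the lookup could be a sketch like this (v9 modular SDK; the typeaheads collection and the "MainDocument"/"Keywords" values are placeholders for whatever you store in CollectionName and FieldName):

// Sketch of the prefix lookup described above; names are illustrative.
import { getFirestore, collection, query, where, limit, getDocs } from "firebase/firestore";

async function lookupTypeahead(searchTerm: string): Promise<ILookupItem[]> {
  const db = getFirestore();
  const prefix = searchTerm.slice(0, 3).toLowerCase();
  const q = query(
    collection(db, "typeaheads"),
    where("Prefix", "==", prefix),
    where("CollectionName", "==", "MainDocument"),
    where("FieldName", "==", "Keywords"),
    limit(3)
  ); // Firestore may ask you to create a composite index for this combination.
  const snap = await getDocs(q); // only a couple of document reads
  const matches = snap.docs.flatMap((d) => (d.data() as ITypeAHead).MatchingValues);
  // Narrow the prefix matches down to the full search term on the client.
  return matches.filter((m) => m.Value.toLowerCase().startsWith(searchTerm.toLowerCase()));
}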
Hope this helps someone else.
I am using firebase for data storage. The data structure is like this:
products: {
  product1: {
    name: "chocolate"
  },
  product2: {
    name: "chochocho"
  }
}
I want to perform an autocomplete operation on this data, and normally I would write the query like this:
"select name from PRODUCTS where productname LIKE '%" + keyword + "%'";
So, for my situation, for example, if the user types "cho", I need to bring back both "chocolate" and "chochocho" as results. I thought about fetching all data under the "products" block and then doing the query on the client, but that may need a lot of memory for a big database. So, how can I perform an SQL LIKE operation?
Thanks
Update: With the release of Cloud Functions for Firebase, there's another elegant way to do this as well by linking Firebase to Algolia via Functions. The tradeoff here is that the Functions/Algolia is pretty much zero maintenance, but probably at increased cost over roll-your-own in Node.
There are no content searches in Firebase at present. Many of the more common search scenarios, such as searching by attribute will be baked into Firebase as the API continues to expand.
In the meantime, it's certainly possible to grow your own. However, searching is a vast topic (think creating a real-time data store vast), greatly underestimated, and a critical feature of your application--not one you want to ad hoc or even depend on someone like Firebase to provide on your behalf. So it's typically simpler to employ a scalable third party tool to handle indexing, searching, tag/pattern matching, fuzzy logic, weighted rankings, et al.
The Firebase blog features a blog post on indexing with ElasticSearch which outlines a straightforward approach to integrating a quick, but extremely powerful, search engine into your Firebase backend.
Essentially, it's done in two steps. Monitor the data and index it:
var Firebase = require('firebase');
var ElasticClient = require('elasticsearchclient');

// initialize our ElasticSearch API
var client = new ElasticClient({ host: 'localhost', port: 9200 });
var index = 'firebase', type = 'widget'; // where the documents get indexed

// listen for changes to Firebase data
var fb = new Firebase('<INSTANCE>.firebaseio.com/widgets');
fb.on('child_added', createOrUpdateIndex);
fb.on('child_changed', createOrUpdateIndex);
fb.on('child_removed', removeIndex);

function createOrUpdateIndex(snap) {
   client.index(index, type, snap.val(), snap.name())
     .on('data', function(data) { console.log('indexed ', snap.name()); })
     .on('error', function(err) { /* handle errors */ });
}

function removeIndex(snap) {
   client.deleteDocument(index, type, snap.name(), function(error, data) {
      if( error ) console.error('failed to delete', snap.name(), error);
      else console.log('deleted', snap.name());
   });
}
Query the index when you want to do a search:
<script src="elastic.min.js"></script>
<script src="elastic-jquery-client.min.js"></script>
<script>
ejs.client = ejs.jQueryClient('http://localhost:9200');
client.search({
index: 'firebase',
type: 'widget',
body: ejs.Request().query(ejs.MatchQuery('title', 'foo'))
}, function (error, response) {
// handle response
});
</script>
There's an example, and a third party lib to simplify integration, here.
I believe you can do:
admin
.database()
.ref('/vals')
.orderByChild('name')
.startAt('cho')
.endAt("cho\uf8ff")
.once('value')
.then(c => res.send(c.val()));
This will find vals whose names start with "cho".
source
The Elasticsearch solution basically binds to add/set/delete and offers a get, with which you can accomplish text searches.
It then saves the contents in MongoDB.
While I love and recommend Elasticsearch for the maturity of the project, the same can be done without another server, using only the Firebase database.
That's what I mean:
(https://github.com/metaschema/oxyzen)
For the indexing part, the function basically:
- JSON-stringifies the document;
- removes all the property names and JSON syntax to leave only the data (regex);
- removes all XML tags (therefore also HTML) and attributes (remember the old guidance, "data should not be in XML attributes") to leave only the pure text, if XML or HTML was present;
- removes all special chars and substitutes them with a space (regex);
- substitutes all instances of multiple spaces with one space (regex);
- splits on spaces and cycles: for each word it adds refs to the document in some index structure in your db that basically contains children named with words, with children named with an escaped version of "ref/inthedatabase/dockey";
- then inserts the document as a normal Firebase application would do.
In the oxyzen implementation, subsequent updates of the document actually read the index and update it, removing the words that no longer match and adding the new ones.
Subsequent searches for a word can then directly find documents under that word's child. Multiple-word searches are implemented by intersecting the hits.
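To make the approach concrete, here is a rough sketch of that kind of inverted index on the Realtime Database. This is not the oxyzen code; the paths, the tokenizer regexes and the multi-location write are illustrative assumptions (TypeScript, Firebase JS SDK v9):

// Sketch only: illustrative paths and tokenizer, not the oxyzen implementation.
import { getDatabase, ref, push, update } from "firebase/database";

// Strip property names, markup and punctuation, then split into lowercase words.
function tokenize(doc: object): string[] {
  const text = JSON.stringify(doc)
    .replace(/"[^"]*":/g, " ")       // drop property names, keep the data
    .replace(/<[^>]*>/g, " ")        // drop xml/html tags and their attributes
    .replace(/[^a-zA-Z0-9]+/g, " ")  // special chars -> space
    .replace(/\s+/g, " ")            // collapse multiple spaces
    .trim()
    .toLowerCase();
  return text ? Array.from(new Set(text.split(" "))) : [];
}

async function addDocumentWithIndex(doc: object): Promise<void> {
  const db = getDatabase();
  const docRef = push(ref(db, "documents"));   // the normal insert location
  const key = docRef.key!;                     // push() key for the new document
  const updates: Record<string, unknown> = { ["documents/" + key]: doc };
  for (const word of tokenize(doc)) {
    // index/<word>/<docKey> = true (words are alphanumeric, so safe as child names)
    updates["index/" + word + "/" + key] = true;
  }
  await update(ref(db), updates); // single multi-location write: document + index entries
}

// Searching one word is then a single read of index/<word>; multi-word searches
// intersect the returned key sets on the client.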
SQL"LIKE" operation on firebase is possible
let node = await db.ref('yourPath').orderByChild('yourKey').startAt('!').endAt('SUBSTRING\uf8ff').once('value');
This query works for me; it is like the MySQL statement below:
select * from StoreAds where University LIKE '%ps%';
query = database.getReference().child("StoreAds").orderByChild("University").startAt("ps").endAt("\uf8ff");
Trying to get a simple result using a "Where"-style query in Firebase, but I get null all the time. Can anyone help with that?
http://jsfiddle.net/vQEmt/68/
new Firebase("https://examples-sql-queries.firebaseio.com/messages")
.startAt('Inigo Montoya')
.endAt('Inigo Montoya')
.once('value', show);
function show(snap) {
$('pre').text(JSON.stringify(snap.val(), null, 2));
}
Looking at the applicable records, I see that the .priority is set to the timestamp, not the username.
Thus, you can't startAt/endAt the user's name as you've attempted here. Those are only applicable to the .priority field. These capabilities will be expanding significantly over the next year, as enhancements to the Firebase API continue to roll out.
For now, your best option for arbitrary search of fields is use a search engine. It's wicked-easy to spin one up and have the full power of a search engine at your fingertips, rather than mucking with glacial SQL-esque queries. It looks like you've already stumbled on the appropriate blog posts for that topic.
You can, of course, use an index which lists users by name and stores the keys of all their post ids. And, considering this is a very small data set--less than 100k--you could even just grab the whole thing and search it on the client (larger data sets could use endAt/startAt/limit to grab a recent subset of messages):
new Firebase("https://examples-sql-queries.firebaseio.com/messages").once('value', function(snapshot) {
var messages = [];
snapshot.forEach(function(ss) {
if( ss.val().name === "Inigo Montoya" ) {
messages.push(ss.val());
}
});
console.log(messages);
});
Also see: Database-style queries with Firebase
I am trying to use the Freebase search API to obtain the parties of specific politicians as well as general biographical data (using PHP). I know that the search API passes the id of each search result found to an MQL query specified by the mql_output parameter.
Here is the MQL query I have at the moment, which I use for the mql_output parameter:
{
  "name": null,
  "/people/person/date_of_birth": null,
  "/people/person/gender": null,
  "/wikipedia/topic/en_id": null,
  "id": null,
  "/government/politician/party": [
    {
      party : null
    }
  ]
}
and this is the resultant query URL:
https://www.googleapis.com/freebase/v1/search?Barrack%20Obama&filter=%28all%20type%3Apolitician%29&mql_output=%7B%22name%22%3Anull%2C%22%5C%2Fpeople%5C%2Fperson%5C%2Fdate_of_birth%22%3Anull%2C%22%5C%2Fpeople%5C%2Fperson%5C%2Fgender%22%3Anull%2C%22%5C%2Fwikipedia%5C%2Ftopic%5C%2Fen_id%22%3Anull%2C%22id%22%3Anull%2C%22%5C%2Fgovernment%5C%2Fpolitician%5C%2Fparty%22%3A%5B%7B%7D%5D%7D&key=AIzaSyDdJ_9L6mcWXinx5Lehku2TULmJhOMESew&indent=true
Thanks for your help; sorry that it's quite a basic question,
Mark
Answer
(Could not self-answer due to not having enough forum rep.)
I just realised what I needed to do after going over some more examples. To have the database retrieve the party information, I needed to use the following query:
{
  "name": null,
  "/people/person/date_of_birth": null,
  "/people/person/gender": null,
  "/wikipedia/topic/en_id": null,
  "id": null,
  "/government/politician/party": {
    "party": null
  }
}
Mark