Elasticsearch fuzziness setting in Meteor.js

I use Elasticsearch for my Meteor app, with elasticsearch-river-mongodb hooking ES and MongoDB together.
Regular full-text search works as expected via the code below:
var result = Meteor.http.get(url, {
  params: {
    q: q,
    fields: '_id,feed_category,summary,title,description,author',
    from: from, // offset into results (defaults to 0)
    size: size  // number of results to return (defaults to 10)
  }
});
I'd like to support fuzzy search, so I tried:
var result = Meteor.http.get(url, {
  params: {
    fuzzy: q,
    fields: '_id,feed_category,summary,title,description,author',
    from: from, // offset into results (defaults to 0)
    size: size  // number of results to return (defaults to 10)
  }
});
It doesn't return the correct result, and it always has 36 hits.
I then tried:
var result = Meteor.http.get(url, {
  params: {
    "fuzzy_like_this": {
      "fields": ["_id", "summary", "title", "description", "author"],
      "like_text": q,
      "max_query_terms": 12
    },
    from: from,
    size: size
  }
});
Same as above: it doesn't return the correct result, and it always has 36 hits.
How do I configure the parameters for ES fuzziness in Meteor?
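For reference, the ES URI-search q parameter accepts Lucene query-string syntax, where appending ~ (optionally with an edit distance) to a term turns on fuzzy matching. A minimal sketch of that approach, reusing url, q, from, and size from the snippets above:

// Append ~2 to each whitespace-separated term to allow up to 2 edits per term.
var fuzzyQ = q.split(/\s+/).map(function (term) {
  return term + '~2';
}).join(' ');

var result = Meteor.http.get(url, {
  params: {
    q: fuzzyQ,
    fields: '_id,feed_category,summary,title,description,author',
    from: from,
    size: size
  }
});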

Related

Kibana Server-Side Plugin in 7.16.3, Response.Body is always empty and fails validation step

The plugin I'm working on is for Kibana 7.16.3.
The server side code currently looks like the following:
import { schema } from '@kbn/config-schema';
import { logger } from 'elastic-apm-node';
import { IRouter } from '../../../../src/core/server';
import { ComplexityAndChurnFactory } from '../resources/cxchquery';
import { validateBody, linearmap } from '../resources/utility';

let elasticSearchHost = '';

export function defineHosts(host: string) {
  elasticSearchHost = host;
}

export function defineRoutes(router: IRouter) {
  router.get(
    {
      path: '/api/complexity_and_churn/agg',
      validate: {
        params: schema.object({}),
        body: schema.object({
          Size: schema.number({}),
          Index: schema.string({}),
          StartDate: schema.string({}),
          EndDate: schema.string({}),
          FileTypeFilters: schema.arrayOf(schema.string({}), {})
        }, {})
      },
    },
    async (context, request, response) => {
      console.log(`Recv Req: ${JSON.stringify(request.body)}`);
      let reqBody = request.body;
      validateBody(reqBody);
      let query = ComplexityAndChurnFactory(reqBody.Index, reqBody.StartDate, reqBody.EndDate, reqBody.FileTypeFilters, 10000);
      let resultSize = reqBody.Size;
      let minScore = 0;
      let maxScore = 50;
      // If the user needs to scan over 10 million files after date range and filtering, there is likely a bigger problem.
      const MAX_QUERIES = 1000;
      let topXScores: Array<Object> = [];
      /** Strategy for getting top scores in one pass of the dataset:
       * Composite aggregation returns a subset of the data => update global min/max complexity/churn based on this data.
       * Based on global min/max complexity/churn, calculate the score of the composite aggregation subset.
       * Based on global min/max complexity/churn, update the score of the previously saved top scores.
       * Join the current aggregation subset and previously saved top scores into one dataset.
       * Remove all but the top x scores.
       * Repeat with the previous composite aggregation's after key until the data is exhausted.
       */
      let minComplexity = Number.POSITIVE_INFINITY;
      let maxComplexity = Number.NEGATIVE_INFINITY;
      let minChurn = Number.POSITIVE_INFINITY;
      let maxChurn = Number.NEGATIVE_INFINITY;
      let i = 0;
      for (i = 0; i < MAX_QUERIES; i++) {
        let resp = await context.core.elasticsearch.client.asCurrentUser.search(query);
        logger.info(`query responded with: ${resp}`);
        // Check for completion
        let buckets = resp.body.aggregations.buckets.buckets;
        if (buckets.length == 0 || !query?.after_key) {
          break;
        }
        // Set up the next query if buckets were returned.
        query.after_key = resp.body.aggregations.buckets.after_key;
        // Track the global min/max complexity and churn seen so far.
        minComplexity = buckets.reduce((p: number, v: any) => Math.min(p, v.complexity.value), minComplexity);
        maxComplexity = buckets.reduce((p: number, v: any) => Math.max(p, v.complexity.value), maxComplexity);
        minChurn = buckets.reduce((p: number, v: any) => Math.min(p, v.churn.value), minChurn);
        maxChurn = buckets.reduce((p: number, v: any) => Math.max(p, v.churn.value), maxChurn);
        // Recalculate scores for topXScores based on the updated min and max complexity and churn.
        topXScores.forEach(element => {
          let complexityScore = linearmap(element.complexity.value, minComplexity, maxComplexity, minScore, maxScore);
          let churnScore = linearmap(element.churn.value, minChurn, maxChurn, minScore, maxScore);
          element.score = complexityScore + churnScore;
        });
        // For new data, calculate the score and add to the topXScores array.
        buckets.forEach(element => {
          let complexityScore = linearmap(element.complexity.value, minComplexity, maxComplexity, minScore, maxScore);
          let churnScore = linearmap(element.churn.value, minChurn, maxChurn, minScore, maxScore);
          element.score = complexityScore + churnScore;
          topXScores.push(element);
        });
        // Sort topXScores by score.
        topXScores = topXScores.sort((a, b) => a.score - b.score);
        // Remove all but the top x scores from the array.
        let numberBucketsToRemove = Math.max(topXScores.length - resultSize, 0);
        topXScores.splice(0, numberBucketsToRemove);
      }
      if (i == MAX_QUERIES) {
        throw new Error(`[ERROR] Exceeded maximum allowed queries (${MAX_QUERIES}) for composite aggregations; please reach out to an administrator to get this amount changed, or limit your query's date range and filters.`);
      }
      return response.ok({
        body: {
          buckets: topXScores
        }
      });
    }
  );
}
When I make a request to the endpoint like the following:
curl --request GET 'http://localhost:5601/api/complexity_and_churn/agg' \
  --header 'kbn-xsrf: anything' \
  --header 'content-type: application/json; charset=utf-8' \
  --header 'Authorization: Basic <Auth>' \
  -d '{
    "Size": 100,
    "Index": "mainindexfour",
    "StartDate": "2010/10/10",
    "EndDate": "2022/10/10",
    "FileTypeFilters": ["xml"]
  }'
I get the response:
{
  "statusCode": 400,
  "error": "Bad Request",
  "message": "[request body.Size]: expected value of type [number] but got [undefined]"
}
If I remove the validation on the body and print out JSON.stringify(request.body), I see that it is an empty object, regardless of what data I send. If I try to use params or query, they also end up being undefined.
Is my server side code or the request I'm sending incorrect?
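A hedged guess at the root cause, assuming Kibana's HTTP layer (hapi under the hood) does not parse payloads for GET requests: the body would be discarded before validation ever runs, so the body schema sees [undefined] no matter what is sent. Registering the route as POST should let the payload through. A minimal sketch of just the changed registration (the handler stays the same as above):

router.post(
  {
    path: '/api/complexity_and_churn/agg',
    validate: {
      body: schema.object({
        Size: schema.number({}),
        Index: schema.string({}),
        StartDate: schema.string({}),
        EndDate: schema.string({}),
        FileTypeFilters: schema.arrayOf(schema.string({}), {})
      }, {})
    },
  },
  async (context, request, response) => {
    // request.body now arrives populated and validated;
    // handler body unchanged from the GET version above.
  }
);

The curl invocation would then use --request POST instead of GET.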

How to order by child value in Firebase

I have data like this:
scores: {
  uid1: {
    score: 15,
    displayName: "Uciska"
  },
  uid2: {
    score: 3,
    displayName: "Bob"
  },
  uid3: {
    // etc...
  }
}
I want to rank them by score and keep only the top 100.
I did that by following the doc, but it does not work: it always returns the same order even if the score changes.
const query = firebase.database().ref('scores')
  .orderByChild('score')
  .limitToLast(100)

query.on('child_added', snapshot => {
  const score = snapshot.val().score
  console.log(score)
})
I also added this to the rules to optimize, but I'm not sure it's correct:
"scores": {
".indexOn": ["score"]
}
What is the right way to go?
Your code is correct and should show the desired result.
You may have difficulty seeing the result because of the 'child_added' event, since "the listener is passed a snapshot containing the new child's data", as detailed in the doc.
You could use the once() method instead, as follows, which will show the result a bit more clearly since it displays the entire set of scores.
const query = firebase.database().ref('scores')
  .orderByChild('score')
  .limitToLast(100)

query.once('value', function (snapshot) {
  snapshot.forEach(function (childSnapshot) {
    var childKey = childSnapshot.key;
    var childData = childSnapshot.val();
    console.log(childData);
    // ...
  });
});
Also, your rule can be written as ".indexOn": "score", since you are only indexing on a single child key.
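One detail worth noting: limitToLast(100) delivers children in ascending order of score, so for a leaderboard display you would typically collect the results and reverse them client-side. A small sketch:

const query = firebase.database().ref('scores')
  .orderByChild('score')
  .limitToLast(100)

query.once('value', function (snapshot) {
  const top = [];
  snapshot.forEach(function (childSnapshot) {
    top.push(childSnapshot.val());
  });
  top.reverse(); // highest score first
  console.log(top);
});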

Scripted Dashboard in Grafana with opentsdb as the source

I want to create a scripted dashboard that takes one OpenTSDB metric as the data source. On the Grafana website I couldn't find any example. I hope I can add a line like:
metric = 'my.metric.name'
into the JavaScript code, and then access the dashboard on the fly.
var rows = 1;
var seriesName = 'argName';

if (!_.isUndefined(ARGS.rows)) {
  rows = parseInt(ARGS.rows, 10);
}
if (!_.isUndefined(ARGS.name)) {
  seriesName = ARGS.name;
}

for (var i = 0; i < rows; i++) {
  dashboard.rows.push({
    title: 'Scripted Graph ' + i,
    height: '300px',
    panels: [
      {
        title: 'Events',
        type: 'graph',
        span: 12,
        fill: 1,
        linewidth: 2,
        targets: [
          {
            'target': "randomWalk('" + seriesName + "')"
          },
          {
            'target': "randomWalk('random walk2')"
          }
        ],
      }
    ]
  });
}
return dashboard;
Sorry to answer my own question, but I just figured it out, and hopefully posting here will benefit somebody.
The script is below. Access the dashboard on the fly with:
http://grafana_ip:3000/dashboard/script/donkey.js?name=tsdbmetricname
/* global _ */

/*
 * Complex scripted dashboard
 * This script generates a dashboard object that Grafana can load. It also takes a number of user
 * supplied URL parameters (in the ARGS variable).
 *
 * Return a dashboard object, or a function.
 *
 * For async scripts, return a function. This function must take a single callback function as argument;
 * call this callback function with the dashboard object (look at scripted_async.js for an example).
 */

// accessible variables in this scope
var window, document, ARGS, $, jQuery, moment, kbn;

// Set up some variables
var dashboard;

// All url parameters are available via the ARGS object
var ARGS;

// Initialize a skeleton with nothing but a rows array and service object
dashboard = {
  rows: [],
};

// Set a title
dashboard.title = 'From Shrek';

// Set default time
// time can be overridden in the url using from/to parameters, but this is
// handled automatically in grafana core during dashboard initialization
dashboard.time = {
  from: "now-6h",
  to: "now"
};

var rows = 1;
var metricName = 'argName';

//if (!_.isUndefined(ARGS.rows)) {
//  rows = parseInt(ARGS.rows, 10);
//}

if (!_.isUndefined(ARGS.name)) {
  metricName = ARGS.name;
}

for (var i = 0; i < rows; i++) {
  dashboard.rows.push({
    title: metricName,
    height: '300px',
    panels: [
      {
        title: metricName,
        type: 'graph',
        span: 12,
        fill: 1,
        linewidth: 2,
        targets: [
          {
            "aggregator": "avg",
            "downsampleAggregator": "avg",
            "errors": {},
            "metric": ARGS.name,
            //"metric": "search-engine.relevance.latency.mean",
            "tags": {
              "host": "*"
            }
          }
        ],
        tooltip: {
          shared: true
        }
      }
    ]
  });
}

return dashboard;
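As a side note, any extra URL parameter arrives in ARGS the same way as name, so the hard-coded "host": "*" tag above could be parameterized too. A small sketch (ARGS.host is a hypothetical parameter, not part of the original script):

var hostFilter = '*';
if (!_.isUndefined(ARGS.host)) {
  hostFilter = ARGS.host; // e.g. ...?name=tsdbmetricname&host=web01
}
// ...and then in the target definition:
// "tags": { "host": hostFilter }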

How to query two types of records in CouchDB

I'm having issues getting two dependent types of data from a PouchDB database.
I have a list of cars that I get like so:
localDB.query(function(doc) {
  if (doc.type === 'list') {
    emit(doc);
  }
}, { include_docs: true }).then(function(response) {
  console.log("cars", response);
  // Save Cars List to app
  for (var i = 0; i < response.rows.length; i++) {
    addToCarsList(response.rows[i].id, response.rows[i].carNumber);
  }
  console.log("Cars List: " + carsListToString());
  return response;
}).then(function(listRecord) {
  listRecord.rows.forEach(function(element, index) {
    console.log(index + ' -> ', element);
    localDB.query(function(doc) {
      console.log("filtering with carNb = " + element.carNb);
      if (doc.type === 'defect' && doc.listId == getCurrentListId() && doc.carNb == element.carNb) {
        emit(doc);
      }
    }, { include_docs: false }).then(function(result) {
      console.log("defects", result);
    }).catch(function(err) {
      console.log("an error has occurred", err);
    });
  });
}).catch(function(err) {
  console.log('error', err);
});
Here's what happens: after getting the list of cars, for each car I would like to query the defects and store them in some arrays. Then, when all that querying is done, I want to build the UI with the saved data.
But what's happening is that the forEach gets processed quickly and does not wait for the inner async'd localDB.query.
How can I query some documents based on an attribute from a parent query? I looked into promises in the PouchDB doc but I can't understand how to do it.
(please forgive possible lint errors; this code was anonymized by hand and ultra-simplified)
The method you are looking for is Promise.all() (execute all promises and return when done).
However, your query is already pretty inefficient. It would be better to create a persistent index, otherwise it has to do a full database scan for every query() (!). You can read up on the PouchDB query guide for details.
I would recommend installing the pouchdb-upsert plugin and then doing:
// helper method
function createDesignDoc(name, mapFunction) {
  var ddoc = {
    _id: '_design/' + name,
    views: {}
  };
  ddoc.views[name] = { map: mapFunction.toString() };
  return ddoc;
}
localDB.putIfNotExists(createDesignDoc('my_index', function (doc) {
  emit([doc.type, doc.listId, doc.carNb]);
})).then(function () {
  // find all docs with type 'list'
  return localDB.query('my_index', {
    startkey: ['list'],
    endkey: ['list', {}],
    include_docs: true
  });
}).then(function (response) {
  console.log("cars", response);
  // Save Cars List to app
  for (var i = 0; i < response.rows.length; i++) {
    addToCarsList(response.rows[i].id, response.rows[i].carNumber);
  }
  console.log("Cars List: " + carsListToString());
  return response;
}).then(function (listRecord) {
  return PouchDB.utils.Promise.all(listRecord.rows.map(function (row) {
    // find all docs with the given type, listId, carNb
    return localDB.query('my_index', {
      key: ['defect', getCurrentListId(), row.doc.carNb],
      include_docs: true
    });
  }));
}).then(function (finalResults) {
  console.log(finalResults);
}).catch(function (err) {
  console.log("an error has occurred", err);
});
I'm using a few tricks here:
emit [doc.type, doc.listId, doc.carNb], which allows us to query by type or by type+listId+carNb.
when querying for just the type, we can do {startkey: ['list'], endkey: ['list', {}]}, which matches just those with the type "list" because {} is "higher" than strings in CouchDB object collation order.
PouchDB.utils.Promise is a "hidden" API, but it's pretty safe to use if you ask me. It's unlikely we'll change it.
Edit: Another option is to use the new pouchdb-find plugin, which offers a simplified query API designed to replace the existing map/reduce query() API.
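For illustration, a minimal sketch of the same defect lookup with pouchdb-find, assuming the field names from the examples above (someCarNb is a hypothetical stand-in for the car number being filtered on):

localDB.createIndex({
  index: { fields: ['type', 'listId', 'carNb'] }
}).then(function () {
  // find() replaces the map/reduce query; the index above makes it efficient
  return localDB.find({
    selector: { type: 'defect', listId: getCurrentListId(), carNb: someCarNb }
  });
}).then(function (result) {
  console.log('defects', result.docs);
});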
Another approach would be to pull both the list docs and the defect docs down at the same time, then merge them together using a reduce-like method that converts them into an array of objects:
{
  _id: 1,
  type: 'list',
  ...
  defects: [{
    type: 'defect',
    listId: 1,
    ...
  }]
}
By pulling the list and the defects down in one call you save several calls to the PouchDB query engine, but you do have to iterate through every result to build your collection of list objects with an embedded array of defects.
// This is untested code so it may not work, but you should get the idea
var _ = require('underscore');

// order document results by list then defect
var view = function (doc) {
  if (doc.type === 'list') {
    emit([doc._id, doc.carNumber, 1]);
  } else if (doc.type === 'defect') {
    emit([doc.listId, doc.carNb, 2]);
  }
};

localDB.query(view, { include_docs: true })
  .then(function (response) {
    return _(response.rows)
      .reduce(function (m, r) {
        if (r.key[2] === 1) {
          // initialize
          r.doc.defects = [];
          m.push(r.doc);
          return m;
        }
        if (r.key[2] === 2) {
          var list = _(m).last();
          if (list._id === r.key[0] && list.carNumber === r.key[1]) {
            list.defects.push(r.doc);
          }
          return m;
        }
      }, []);
  })
  .then(function (lists) {
    // bind to UI
  });
With CouchDB, we found reducing calls to the query engine to be more performant. I don't know whether this approach is better for PouchDB, but it should work as a solution, especially if you want to embed several collections into one list document.

In Collection.find, how to format .limit, .sort, fieldlist, and variable column names

In non-Meteor server-side calls to MongoDB it is possible to make the following chained-option call to the database:
collection.find({ myField: { $gte: myOffset } }).limit(myLimit).sort({ mySortField: 1 });
where myField, myOffset, myLimit and mySortField may be resolved from elsewhere at run-time.
This pattern is very useful for creating run-time generated generic queries.
Meteor seems to insist on the non-chained options pattern of
collection.find({ myField: { $gte: myOffset } }, { limit: myLimit, sort: { mySortField: 1 } });
and I am having problems 'building up' a working find query, as required above, from js objects, as described in previous questions 17362401 and 10959729.
Would anyone like to help?
Edited to show usage of a variable:
I do it this way: you send two hashes, where the first is the where clause and everything else are peer-level keys.
var locations;
var myfield = 'gps';

var search = {
  sureties: {
    $in: sureties
  }
};
search[myfield] = {
  $near: this.gps,
  $maxDistance: kilometers
};

locations = Agents.find(search, {
  fields: {
    name: 1,
    phone: 1
  },
  limit: limit,
  sort: { field1: 1 }
}).fetch();
The chained pattern is not possible in Meteor, neither server-side nor on the client. But the params pattern is just as expressive; you should be able to create any query you need with those params.
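To make the variable-field idea explicit, both the selector and the options object can be built up with computed keys before a single find() call. A compact sketch using the names from the question:

// Build the selector with a variable field name.
var selector = {};
selector[myField] = { $gte: myOffset };

// Build the options hash with a variable sort field.
var options = { limit: myLimit, sort: {} };
options.sort[mySortField] = 1;

var cursor = collection.find(selector, options);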
