Elasticsearch: how to get counts for several ranges in one query?

Currently I am getting the count for a range of values via this query:
$ curl -XGET 'http://localhost:9200/myindex/mytype/_count' -d '{
  "range": {"myfield": {"gt": "start_val", "lt": "end_val"}}
}'
Now I have several ranges and need a count for each of them. Can I get them with one query, rather than re-querying each time?
I looked into multi-search with search_type=count, but it is probably not the right approach (it gave me just one aggregated count rather than a count per range; it looks like I misused it).
EDIT: I've found that the range facet would have been ideal, but unfortunately my values are neither numbers nor dates; they're just strings.
EDIT2: This is what I ended up with, based on the accepted answer:
$ curl -XGET 'http://localhost:9200/myindex/mytype/_search?search_type=count' -d '{
  "facets": {
    "range1": {
      "filter": {
        "range": {"myfield": {"gt": "start_val1", "lt": "end_val1"}}
      }
    },
    "range2": {
      "filter": {
        "range": {"myfield": {"gt": "start_val2", "lt": "end_val2"}}
      }
    }
  }
}'

Here's a link where the creator of ES gives a solution (i.e., several filter facets):
solution
So, one filter facet per range should work fine.
Here's the link to the docs:
doc api
Hope it helps.
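When the number of ranges grows, the request body with one filter facet per range can be generated instead of written by hand. The following is a minimal sketch of that idea; the helper name build_range_facets and the (name, start, end) tuple layout are assumptions made for illustration, and the code only builds and prints the JSON body without contacting a cluster.

```python
import json


def build_range_facets(field, ranges):
    """Build a search body with one filter facet per (name, start, end) range.

    Hypothetical helper: it mirrors the facet layout shown in the question.
    """
    facets = {}
    for name, start, end in ranges:
        facets[name] = {
            "filter": {"range": {field: {"gt": start, "lt": end}}}
        }
    return {"facets": facets}


body = build_range_facets(
    "myfield",
    [
        ("range1", "start_val1", "end_val1"),
        ("range2", "start_val2", "end_val2"),
    ],
)

# The printed JSON can be passed to curl with -d, as in the question.
print(json.dumps(body, indent=2))
```

Each facet name in the body comes back as a key in the response, so the names double as labels for the per-range counts.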


weaviate aggregate by reference property

I want to build an aggregate query on my data.
I have a Patent class that is linked by cross-references to a Paragraph class (the paragraphs hold the vectorized text).
I want to count the patents in each category (a property of Patent) whose paragraphs are near a given vector.
In pseudo-SQL:
select (count distinct Patent)
from myweaviate
where Paragraph.nearVector(vector, certainty=0.9)
group by catagory
I tried something like the following (which would be wrong even if it worked, because it counts paragraphs rather than patents):
result = (client.query.aggregate("Paragraph") \
    .with_group_by_filter(["inPatent{... on Patent{publicationID}"]) \
    .with_fields('meta { count }') \
    .with_fields('groupedBy {value}') \
    .with_near_vector({'vector': vector, 'certainty': 0.8}) \
    .do())
and I get:
{'data': {'Aggregate': {'Paragraph': None}}, 'errors': [{'locations': [{'column': 12, 'line': 1}], 'message': "could not extract groupBy path: Expected a valid property name in 'path' field for the filter, but got 'inPatent{... on Patent{publicationID}'", 'path': ['Aggregate', 'Paragraph']}]}
I couldn't find anything in the docs or on the internet about doing something like this (i.e., using aggregate on a reference property),
or about doing a count distinct (although in this case the Patent objects are distinct anyway).
Can anyone help?
Unfortunately, it is not possible to group by cross-references. The error in your case means that you did not construct a valid path: the path needs to be a list in which each item is a valid element, i.e. it should look like this: path: ["inPatent", "Patent", "publicationID"]. It goes property -> class name -> property -> class name -> ... until you reach the desired field. However, Weaviate currently does not support Aggregate.groupBy with cross-references, so if you run your query again with the correct path you should get something like this:
"message": "shard 9wKKa18SJOiM: identify groups: grouping by cross-refs not supported"
Note that it is possible to use the cross-reference property itself as your groupBy path (you want to aggregate on the Patent ID, and the UUID (and beacon) of each Patent object is unique and maps one-to-one to its publicationID). It should look like this:
result = (client.query.aggregate("Paragraph") \
    .with_group_by_filter(["inPatent"]) \
    .with_fields('meta { count }') \
    .with_fields('groupedBy {value}') \
    .with_near_vector({'vector': vector, 'certainty': 0.8}) \
    .do())

Finding JSONPath value by a partial key

I have the following JSON:
{
  "Dialog_1": {
    "en": {
      "label_1595938607000": "Label1",
      "newLabel": "Label2"
    }
  }
}
I want to extract "Label1" by using JSONPath. The problem is that each time I get a JSON with a different number after "label_", and I'm looking for a consistent JSONPath expression that will return the value for any key that begins with "label_" (without knowing in advance the number after the underscore).
It is not possible with JSONPath. EL (Expression Language) does not have such a capability.
Besides, I think you need to review your design. Why does the key name change all the time? If it changes, then it is data, and it should be stored as a value rather than encoded in a key name.
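If the lookup can happen in code rather than in a pure JSONPath expression, matching keys by prefix is straightforward. The sketch below assumes the JSON from the question and uses a hypothetical helper, values_with_key_prefix, that walks the parsed document and collects every value whose key starts with the given prefix.

```python
import json

doc = json.loads("""
{
  "Dialog_1": {
    "en": {
      "label_1595938607000": "Label1",
      "newLabel": "Label2"
    }
  }
}
""")


def values_with_key_prefix(node, prefix):
    """Recursively collect values whose key starts with `prefix`."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key.startswith(prefix):
                found.append(value)
            # Keep descending so nested objects are searched too.
            found.extend(values_with_key_prefix(value, prefix))
    elif isinstance(node, list):
        for item in node:
            found.extend(values_with_key_prefix(item, prefix))
    return found


print(values_with_key_prefix(doc, "label_"))  # ['Label1']
```

Note that "newLabel" is not matched, because the comparison is a case-sensitive prefix test on the key.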

Using Multiple Variables to Reference a Sub-Sub-Sub Field in a Lua Dictionary

I'm new to Lua (like, yesterday new), so please bear with me...
I apologize for the convoluted nature of this question, but I had no better idea of how to demonstrate what I'm trying to do:
I have a Lua table being used as a dictionary. The tuples(?) are not numerically indexed, but use mostly string indices. Many of the indices actually relate to sub-tables that contain more detailed information, and some of the indices in those tables relate to still more tables - some of them three or four "levels" deep.
I need to make a function that can search for a specific item description from several "levels" into the dictionary's structure, without knowing ahead of time which keys/sub-keys/sub-sub-keys led me to it. I have tried to do this using variables and for loops, but have run into a problem where two keys in a row are being dynamically tested using these variables.
In the example below, I'm trying to get at the value:
myWarehouseList.Warehouse_North.departments.department_one["rjXO./SS"].item_description
But since I don't know ahead of time that I'm looking in "Warehouse_North", or in "department_one", I run through these alternatives using variables, searching for the specific Item ID "rjXO./SS", and so the reference to that value ends up looking like this:
myWarehouseList[warehouse_key].departments[department_key][myItemID]...?
Basically, the problem I'm having is when I need to put two variables back-to-back in the reference chain of a value being stored at level N of a dictionary. I can't seem to write it out as [x][y], or as [x[y]], or as [x.y] or as [x].[y]... I understand that in Lua, x.y is not the same as x[y] (the former directly references a key by string index "y", while the latter uses the value being stored in variable "y", which could be anything.)
I've tried many different ways and only gotten errors.
What's interesting is that if I use the exact same approach, but add an additional "level" to the dictionary with a constant value, such as ["items"] (under each specific department), it allows me to reference the value without issue, and my script runs fine...
myWarehouseList[warehouse_key].departments[department_key].items[item_key].item_description
Is this how Lua syntax is supposed to look? I've changed the table structure to include that extra layer of "items" under each department, but it seems redundant and unnecessary. Is there a syntactical change that I can make to allow me to use two variables back-to-back in a Lua table value reference chain?
Thanks in advance for any help!
myWarehouseList = {
    ["Warehouse_North"] = {
        ["description"] = "The northern warehouse"
        ,["departments"] = {
            ["department_one"] = {
                ["rjXO./SS"] = {
                    ["item_description"] = "A description of item 'rjXO./SS'"
                }
            }
        }
    }
    ,["Warehouse_South"] = {
        ["description"] = "The southern warehouse"
        ,["departments"] = {
            ["department_one"] = {
                ["rjXO./SX"] = {
                    ["item_description"] = "A description of item 'rjXO./SX'"
                }
            }
        }
    }
}
function get_item_description(item_id)
    myItemID = item_id
    for warehouse_key, warehouse_value in pairs(myWarehouseList) do
        for department_key, department_value in pairs(myWarehouseList[warehouse_key].departments) do
            for item_key, item_value in pairs(myWarehouseList[warehouse_key].departments[department_key]) do
                if item_key == myItemID
                then
                    print(myWarehouseList[warehouse_key].departments[department_key]...?)
                    -- [department_key[item_key]].item_description?
                    -- If I had another level above "department_X", with a constant key, I could do it like this:
                    -- print(
                    -- "\n\t" .. "Item ID " .. item_key .. " was found in warehouse '" .. warehouse_key .. "'" ..
                    -- "\n\t" .. "In the department: '" .. department_key .. "'" ..
                    -- "\n\t" .. "With the description: '" .. myWarehouseList[warehouse_key].departments[department_key].items[item_key].item_description .. "'")
                    -- but without that extra, constant "level", I can't figure it out :)
                else
                end
            end
        end
    end
end
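Back-to-back variable indices such as departments[department_key][item_key].item_description are valid Lua, so no extra constant level is needed. However, if you make full use of your looping variables, you don't need those long index chains at all. You appear to be relying only on the key variables, but it's actually the value variables that hold most of the information you need: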
function get_item_description(item_id)
    for warehouse_key, warehouse_value in pairs(myWarehouseList) do
        for department_key, department_value in pairs(warehouse_value.departments) do
            for item_key, item_value in pairs(department_value) do
                if item_key == item_id then
                    print(warehouse_key, department_key, item_value.item_description)
                end
            end
        end
    end
end
get_item_description'rjXO./SS'
get_item_description'rjXO./SX'

JQ: Nested JSON Array transformation

I have a small problem with a jq transformation (jq 1.5 on Windows 10). Until recently this command worked fine:
"[{nid, title, nights, company: .operator.shortTitle, zone: .zones[0].title}
+ (.sails[] | { sails_nid: .nid, arrival, departure } )
+ (.sails[].cabins[] | { cabinname: .cabinType.title, cabintype: .cabinType.kindName, cabinnid: .cabinType.nid, catalogPrice, discountPrice, discountPercentage, currency } )]"
A few days ago the API started delivering "bigger" JSON files (JSON File). With this jq command I now get a lot of duplicates (with the attached file I get around 3146 objects; the expected number is around 250). I tried to change the jq command to avoid the duplicates, but had no luck.
The JSON file contains a variable number of sails (10 in this case), and each sail has a variable number of cabins (25 in this case). Any tips on how I can achieve that? Regards, Timo
This is probably what you're looking for:
[{nid, title, nights, company: .operator.shortTitle, zone: .zones[0].title}
 + (.sails[] | ({ sails_nid: .nid, arrival, departure } +
                (.cabins[] | { cabinname: .cabinType.title,
                               cabintype: .cabinType.kindName,
                               cabinnid: .cabinType.nid,
                               catalogPrice,
                               discountPrice,
                               discountPercentage,
                               currency } ))) ]
Hopefully the layout will clarify the difference with your jq filter.
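The difference between the two filters can also be shown outside jq. The Python sketch below is an illustration of the same iteration logic on made-up data (2 sails with 2 cabins each; the field names are simplified assumptions, not the real API schema): iterating sails and cabins independently pairs every sail with every cabin, while reading the cabins from the current sail yields one object per cabin.

```python
# Tiny stand-in for one API record (hypothetical data).
record = {
    "nid": 1,
    "sails": [
        {"nid": 10, "cabins": [{"price": 100}, {"price": 200}]},
        {"nid": 20, "cabins": [{"price": 300}, {"price": 400}]},
    ],
}

# Like the original filter: `.sails[]` and `.sails[].cabins[]` are iterated
# independently, so each sail is combined with the cabins of all sails.
flat = [
    {"nid": record["nid"], "sails_nid": sail["nid"], "price": cabin["price"]}
    for sail in record["sails"]
    for other_sail in record["sails"]
    for cabin in other_sail["cabins"]
]

# Like the corrected filter: cabins are taken from the sail being iterated.
nested = [
    {"nid": record["nid"], "sails_nid": sail["nid"], "price": cabin["price"]}
    for sail in record["sails"]
    for cabin in sail["cabins"]
]

print(len(flat), len(nested))  # 8 4
```

With 2 sails of 2 cabins each, the independent iteration produces 2 x 4 = 8 objects, while the nested one produces the expected 4.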

How to get a list of all venues in Here API?

I am trying to get a list of all venues available through the Here API by requesting a venue index. My URL looks as follows (with the strings replaced where necessary):
static-3.venue.maps.cit.api.here.com/1/models-poi/index_bb.js?Policy={Policy}&Signature={Signature}&Key-Pair-Id={Key Pair}&app_id={App Id}&app_code={App Code}
This returns a JSON table, which is what I want, but I only get 155 entries, although there are clearly more. Does anyone know why I don't get the full list? Thanks. Below are the first couple of lines of the output I get.
JSON.venues([{ "gml:id" : "DM_8961", "bb": [ [52.4564118412704,13.384279785476354],[52.454433435991014,13.388207656793229] ]},{ "gml:id" : "DM_10465", "bb": [ [52.43143833815419,13.453328297714588],[52.4288047627723,13.45769174285097] ]},{ "gml:id" : "DM_17394", "bb": [ [52.475570406808345,13.458645521436816],
and so on
You're connecting to a CIT Environment where not all Venue Data is exposed.
