Kibana Average Not as expected

I am new to Kibana. I have indexed the following data into Elasticsearch: just the passed, failed and skipped counts for test cases.
testMethodsSummary.passed:0 testMethodsSummary.failed:1 testMethodsSummary.skipped:0 _id:AWBP0yDXO9VGNRQOwYSD _type:uc _index:msm _score:1
testMethodsSummary.passed:1 testMethodsSummary.failed:0 testMethodsSummary.skipped:0 _id:AWBP0wHiO9VGNRQOwYSC _type:uc _index:msm _score:1
testMethodsSummary.passed:5 testMethodsSummary.failed:1 testMethodsSummary.skipped:0 _id:AWBP0tthO9VGNRQOwYSB _type:bat _index:msm _score:1
testMethodsSummary.passed:1 testMethodsSummary.failed:0 testMethodsSummary.skipped:6 _id:AWBP0qTxO9VGNRQOwYSA _type:bat _index:msm _score:1
When I query the totals, they come out OK:
"aggregations": {
"total_fail": {
"value": 2
},
"total_skipped": {
"value": 6
},
"total_pass": {
"value": 7
}
}
But when I try to get the averages, the average pass is not 7/15. I don't even know where these numbers are coming from.
"aggregations": {
"avg_fail": {
"value": 0.5
},
"avg_skip": {
"value": 1.5
},
"avg_pass": {
"value": 1.75
}
}
Can anyone please explain?

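The average aggregation in Elasticsearch is calculated per document: it divides the sum of a field by the number of documents aggregated, not by the combined total of all the counters.
For more information on the average aggregation: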
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-avg-aggregation.html
"aggregations": { "avg_fail": { "value": 0.5 }, "avg_skip": { "value": 1.5 }, "avg_pass": { "value": 1.75 } }
For avg_fail it is (sum of all the failed counts)/(total doc count in the index), i.e. 2/4 = 0.5.
Similarly, avg_skip is 6/4 = 1.5 and avg_pass is 7/4 = 1.75.
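The arithmetic above can be reproduced outside Elasticsearch. The following is a minimal Python sketch, not an Elasticsearch call: the four documents are copied from the question, and the helper names (total, avg_per_doc, share_of_all_tests) are made up for illustration. It contrasts the per-document average that was returned with the 7/15 share the question expected.

```python
# The four documents from the question, reduced to their three counters.
docs = [
    {"passed": 0, "failed": 1, "skipped": 0},
    {"passed": 1, "failed": 0, "skipped": 0},
    {"passed": 5, "failed": 1, "skipped": 0},
    {"passed": 1, "failed": 0, "skipped": 6},
]
FIELDS = ("passed", "failed", "skipped")


def total(field):
    """Sum of one counter over all documents (what the sum query returned)."""
    return sum(d[field] for d in docs)


def avg_per_doc(field):
    """Sum divided by the number of documents (what the avg query returned)."""
    return total(field) / len(docs)


def share_of_all_tests(field):
    """Sum divided by the grand total of all counters (the expected 7/15)."""
    grand_total = sum(total(f) for f in FIELDS)
    return total(field) / grand_total


print({f: avg_per_doc(f) for f in FIELDS})
print(share_of_all_tests("passed"))
```

If the 7/15 style ratio is what is needed, one option is to request the sums and divide them on the client side, as share_of_all_tests does here.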


Is there a way to transform these 2 arrays by using jq, into a set of objects, like in the example down below?

Example json data:
{
"data": [
{
"place": "FM346",
"id": [
"7_day_A",
"7_day_B",
"7_day_C",
"7_day_D"
],
"values": [
0,
30,
23,
43
]
},
{
"place": "LH210",
"id": [
"1_day_A",
"1_day_B",
"1_day_C",
"1_day_D"
],
"values": [
4,
45,
100,
9
]
}
]
}
What I need to transform it into:
{
"data": [
{
"place": "FM346",
"7_day_A": {
"value": 0
},
"7_day_B": {
"value": 30
},
"7_day_C": {
"value": 23
},
"7_day_D": {
"value": 43
}
},
{
"place": "LH210",
"1_day_A": {
"value": 4
},
"1_day_B": {
"value": 45
},
"1_day_C": {
"value": 100
},
"1_day_D": {
"value": 9
}
}
]
}
I have tried this:
{
data:[.data |.[]|
{
place: (.place),
(.id[]):
{
value: (.values[])
}
}]
}
(in jqplay: https://jqplay.org/s/f4BBtN9gwmp)
and this:
{
data:[.data |.[]|
{
place: (.place),
test:
[{
(.id[]):
{
value: (.values[])
}
}]
}]
}
(in jqplay: https://jqplay.org/s/pKIvQe1CzgX)
But the results aren't grouped the way I wanted, and each value is paired with every id rather than only with the corresponding one.
I have been trying for some time now, but I'm new to jq and have no idea how to transform the data this way. Thanks in advance for any answers.
You can use transpose here, which does the key work of converting the two arrays into key/value pairs:
.data[] |= {place} +
([ .id, .values ] | transpose | map({(.[0]): { value: .[1] } }) | add)
The solution works by transposing the array of arrays [.id, .values], i.e. converting
[["7_day_A","7_day_B","7_day_C","7_day_D"],[0,30,23,43]]
[["1_day_A","1_day_B","1_day_C","1_day_D"],[4,45,100,9]]
to
[["7_day_A",0],["7_day_B",30],["7_day_C",23],["7_day_D",43]]
[["1_day_A",4],["1_day_B",45],["1_day_C",100],["1_day_D",9]]
With the transposition done, we build one object per pair, using the element at index 0 as the key and an object holding the element at index 1 as the value, and then merge the results together with add.
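For readers more familiar with Python, the same pairing idea can be sketched as follows. This is only an illustration of the transpose step, not a replacement for the jq filter: the function name pair_ids_with_values is hypothetical, and zip() stands in for transpose. One difference to keep in mind is that zip() stops at the shorter list.

```python
def pair_ids_with_values(entry):
    """Turn parallel "id" and "values" lists into {id: {"value": v}} entries."""
    # Keep "place" as-is, like {place} in the jq filter.
    result = {"place": entry["place"]}
    # zip() walks both lists in lockstep, so each id only meets its own value.
    for key, value in zip(entry["id"], entry["values"]):
        result[key] = {"value": value}
    return result


# Input copied from the question.
doc = {
    "data": [
        {
            "place": "FM346",
            "id": ["7_day_A", "7_day_B", "7_day_C", "7_day_D"],
            "values": [0, 30, 23, 43],
        },
        {
            "place": "LH210",
            "id": ["1_day_A", "1_day_B", "1_day_C", "1_day_D"],
            "values": [4, 45, 100, 9],
        },
    ]
}

transformed = {"data": [pair_ids_with_values(e) for e in doc["data"]]}
print(transformed)
```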

Dynamically Parse Child Nodes in JSON

I have a deserialized object that I want to dynamically loop through to return the related results. The response package looks like so:
{"RatingResponse":
{"Success":"true",
"Message":"",
"QuoteID":"57451",
"LoadNum":"57451",
"Rates":
{"Rate":
[
{"SCAC":"test1",
"CarrierName":"TEST1",
"TransitTime":"1",
"ServiceLevel":"D",
"TotalCost":"1,031.82",
"ThirdPartyCharge":"1,031.82",
"Accessorials":
{"Accessorial":
[
{"Code":"400",
"Cost":"1,655.55",
"Description":"Freight"
},
{"Code":"DSC",
"Cost":"-952.77",
"Description":"Discount"
},
{"Code":"FUE",
"Cost":"329.04",
"Description":"Fuel Surcharge"
}
]
},
"QuoteNumber":""
},
{"SCAC":"test2",
"CarrierName":"TEST2",
"TransitTime":"1",
"ServiceLevel":"D",
"TotalCost":"1,031.82",
"ThirdPartyCharge":"1,031.82",
"Accessorials":
{"Accessorial":
[
{"Code":"400",
"Cost":"1,655.55",
"Description":"Freight"
},
{"Code":"DSC",
"Cost":"-952.77",
"Description":"Discount"
},
{"Code":"FUE",
"Cost":"329.04",
"Description":"Fuel Surcharge"
}
]
},
"QuoteNumber":""
}
]
},
"AverageTotalCost":"1,031.82"
}
}
I have parsed the response data so that there is less information to work with, especially since I only need the Accessorial Costs. The parsed response looks like
[
{
"SCAC": "test1",
"CarrierName": "TEST1",
"TransitTime": "1",
"ServiceLevel": "D",
"TotalCost": "1,031.82",
"ThirdPartyCharge": "1,031.82",
"Accessorials": {
"Accessorial": [
{
"Code": "400",
"Cost": "1,655.55",
"Description": "Freight"
},
{
"Code": "DSC",
"Cost": "-952.77",
"Description": "Discount"
},
{
"Code": "FUE",
"Cost": "329.04",
"Description": "Fuel Surcharge"
}
]
},
"QuoteNumber": ""
},
{
"SCAC": "test2",
"CarrierName": "TEST2",
"TransitTime": "1",
"ServiceLevel": "D",
"TotalCost": "1,031.82",
"ThirdPartyCharge": "1,031.82",
"Accessorials": {
"Accessorial": [
{
"Code": "400",
"Cost": "1,655.55",
"Description": "Freight"
},
{
"Code": "DSC",
"Cost": "-952.77",
"Description": "Discount"
},
{
"Code": "FUE",
"Cost": "329.04",
"Description": "Fuel Surcharge"
}
]
},
"QuoteNumber": ""
}
]
The problem I am facing is that I will never know how many Rate items will come back in the response data, nor will I know the exact number of Accessorial costs. I'm hoping to capture the Rate child node count and the Accessorial child node count per Rate. Here's what I have so far.
Root rootObject = Newtonsoft.Json.JsonConvert.DeserializeObject<Root>(responseFromServer);
//rate stores the parsed response data
JArray rate = (JArray)JObject.Parse(responseFromServer)["RatingResponse"]["Rates"]["Rate"];
var rate2 = rate.ToString();
//this for loop works as expected. it grabs the number of Rate nodes (in this example, 2)
for (int i = 0; i < rate.Count(); i++)
{
dynamic test2 = rate[i];
//this is where I'm struggling
dynamic em = (JArray)JObject.Parse(test2)["Accessorials"]["Accessorial"].Count();
for (int j = 0; j < em; j++)
{
string test3 = test2.Accessorials.Accessorial[j].Cost;
System.IO.File.AppendAllText(logPath, Environment.NewLine + test3 + Environment.NewLine);
}
}
I apologize in advance for the bad formatting and odd variable names - I'm obviously still testing the functionality, so I've been using random variables.
Where I'm struggling (as noted above) is getting to the Accessorial node to count how many items are in its array. I was thinking I could parse the first array (starting with the SCAC data) and drill down to the Accessorial node, but I'm not having any luck.
Any help is GREATLY appreciated, especially since I am new to this type of code and have spent the majority of the day trying to resolve this.
You can try this:
// requires: using System.Linq; using Newtonsoft.Json.Linq;
var rates = (JArray)JObject.Parse(json)["RatingResponse"]["Rates"]["Rate"];
var costs = rates.Select(r => new
{
    CarrierName = r["CarrierName"],
    // sum every accessorial cost except the discount, however many there are
    Costs = ((JArray)((JObject)r["Accessorials"])["Accessorial"])
        .Where(a => (string)a["Description"] != "Discount")
        .Select(a => (double)a["Cost"]).Sum()
}).ToList();
Result:
[
{
"CarrierName": "TEST1",
"Costs": 1984.59
},
{
"CarrierName": "TEST2",
"Costs": 1984.59
}
]
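The underlying idea, iterating over however many Rate and Accessorial items arrive instead of counting them up front, can also be sketched in Python. This is an illustration under assumptions, not the answer's method: the function name accessorial_costs is hypothetical, and the sample response is a trimmed and altered copy of the one in the question (the second rate is cut down to one accessorial to show differing counts).

```python
import json

# Trimmed, altered sample: two rates with different numbers of accessorials.
response_text = """
{"RatingResponse": {"Rates": {"Rate": [
  {"CarrierName": "TEST1",
   "Accessorials": {"Accessorial": [
     {"Code": "400", "Cost": "1,655.55", "Description": "Freight"},
     {"Code": "DSC", "Cost": "-952.77", "Description": "Discount"},
     {"Code": "FUE", "Cost": "329.04", "Description": "Fuel Surcharge"}]}},
  {"CarrierName": "TEST2",
   "Accessorials": {"Accessorial": [
     {"Code": "400", "Cost": "1,655.55", "Description": "Freight"}]}}
]}}}
"""


def accessorial_costs(text):
    """Return, per rate, the carrier, the accessorial count and each cost."""
    rates = json.loads(text)["RatingResponse"]["Rates"]["Rate"]
    summary = []
    for rate in rates:  # works for any number of rates
        accessorials = rate["Accessorials"]["Accessorial"]
        # Costs arrive as strings with thousands separators, e.g. "1,655.55".
        costs = [float(a["Cost"].replace(",", "")) for a in accessorials]
        summary.append(
            {"carrier": rate["CarrierName"], "count": len(accessorials), "costs": costs}
        )
    return summary


summary = accessorial_costs(response_text)
print(summary)
```

Looping over the collections directly avoids the separate count lookup that the question's inner loop was struggling with.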

Cosmos DB SQL API query for children of nested objects

I would like to find a better way to check whether documents in a collection have an array property with more than 0 elements, i.e. anything that isn't empty.
For example: select * from c where c.property = 'x' and array_length(c.child) > 0 and array_length(c.child.grandchild) > 0
The first array_length works. Adding the second with plain dot notation doesn't work, as I read somewhere else. How can I accomplish this? Each child can have anywhere from 0 to many grandchild elements, and I want the documents where that array length is greater than 0.
Please let me know if more clarification is needed.
Please use the SQL below:
SELECT distinct c.id,c.name,c.child FROM c
join child in c.child
where array_length(c.child) > 0
and array_length(child.grandchild) > 0
My sample documents:
[
{
"id": "1",
"name": "Jay",
"child": [
{
"name": "A",
"grandchild": [
{
"name": "A1"
},
{
"name": "A2"
}
]
},
{
"name": "B",
"grandchild": [
{
"name": "B1"
},
{
"name": "B2"
}
]
}
]
},
{
"id": "2",
"name": "Tom",
"child": [
{
"name": "A",
"grandchild": []
},
{
"name": "B",
"grandchild": []
}
]
}
]
Hope it helps you.
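To make the effect of the JOIN easier to see, here is a plain Python sketch of the same filter applied to the sample documents. It does not use the Cosmos DB SDK; the function name has_nonempty_grandchild is hypothetical. The JOIN in the query produces one row per (document, child) pair, the WHERE clause keeps pairs whose child has a non-empty grandchild array, and DISTINCT collapses the surviving rows back to one per document, which is what any() expresses here.

```python
# Sample documents copied from the answer.
docs = [
    {
        "id": "1",
        "name": "Jay",
        "child": [
            {"name": "A", "grandchild": [{"name": "A1"}, {"name": "A2"}]},
            {"name": "B", "grandchild": [{"name": "B1"}, {"name": "B2"}]},
        ],
    },
    {
        "id": "2",
        "name": "Tom",
        "child": [
            {"name": "A", "grandchild": []},
            {"name": "B", "grandchild": []},
        ],
    },
]


def has_nonempty_grandchild(doc):
    """True if at least one child of the document has a non-empty grandchild array."""
    return any(len(child.get("grandchild", [])) > 0 for child in doc.get("child", []))


matching_ids = [d["id"] for d in docs if has_nonempty_grandchild(d)]
print(matching_ids)
```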

Create an object with specified indexes

I am trying to loop over every object using jq.
Sample input generated by Elasticsearch:
{
"took": 202,
"timed_out": false,
"aggregations": {
"aggsDateHistogram": {
"buckets": [
{
"key": 1465974236000,
"search": {
"value": 14
}
},
{
"key": 1465975137000,
"search": {
"value": 16
}
}
]
}
}
}
I want an object that pairs each key with the value field of the corresponding search object.
{ "date": .aggregations.aggsDateHistogram.buckets[].key, "value": .aggregations.aggsDateHistogram.buckets[].search.value }
This gives me objects, but as a cartesian product. I only want pairs like
key[1] : search[1].value
key[2] : search[2].value
So you want to produce this output?
[
{
"key": 1465974236000,
"value": 14
},
{
"key": 1465975137000,
"value": 16
}
]
The following will do just that:
.aggregations[].buckets
| map({key: .key, value: .search.value})
And from a terminal:
jq '.aggregations[].buckets
| map({key: .key, value: .search.value})' input.json
Here is a slightly simpler solution
[ .aggregations[].buckets[] | {key, value:.search.value} ]
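The difference between the cartesian product in the question and the per-bucket pairing in the answers can be sketched in Python. This is an illustration only, using the sample input from the question; the variable names cartesian and paired are made up for the example.

```python
from itertools import product

# Sample input copied from the question.
response = {
    "took": 202,
    "timed_out": False,
    "aggregations": {
        "aggsDateHistogram": {
            "buckets": [
                {"key": 1465974236000, "search": {"value": 14}},
                {"key": 1465975137000, "search": {"value": 16}},
            ]
        }
    },
}

buckets = response["aggregations"]["aggsDateHistogram"]["buckets"]

# Iterating over the buckets twice, independently, combines every key with
# every value: this mirrors using buckets[] twice in one object construction.
keys = [b["key"] for b in buckets]
values = [b["search"]["value"] for b in buckets]
cartesian = [{"date": k, "value": v} for k, v in product(keys, values)]

# Iterating once and reading both fields from the same bucket keeps them paired.
paired = [{"key": b["key"], "value": b["search"]["value"]} for b in buckets]

print(len(cartesian), paired)
```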

Elastic : make a light count query (vs search query)

I am accessing bulk data in Elasticsearch through R. For analytics purposes I need to query data for a relatively long period (say, a month). The data for a month is approx. 4.5 million rows, and R runs out of memory.
Sample code is below (for 1 day):
dt <- as.Date("2015-09-01", "%Y-%m-%d")
frmdt <- strftime(dt,"%Y-%m-%d")
todt <- as.Date(dt+1)
todt <- strftime(todt,"%Y-%m-%d")
connect(es_base="http://xx.yy.zzz.kk")
start_date <- as.integer(as.POSIXct(frmdt))*1000
end_date <- as.integer(as.POSIXct(todt))*1000
query <- sprintf('{"query":{"range":{"time":{"gte":"%s","lte":"%s"}}}}',start_date,end_date)
s_list <- elastic::Search(index = "organised_2015_09",type = "PROPERTY_SEARCH", body=query ,
fields = c("trackId", "time"), size=1000000)$hits$hits
length(s_list)
[1] 144612
This result for 1 day has 144k records and is 222 MB. Sample list item below:
> s_list[[1]]
$`_index`
[1] "organised_2015_09"
$`_type`
[1] "PROPERTY_SEARCH"
$`_id`
[1] "1441122918941"
$`_version`
[1] 1
$`_score`
[1] 1
$fields
$fields$time
$fields$time[[1]]
[1] 1441122918941
$fields$trackId
$fields$trackId[[1]]
[1] "fd4b4ce88101e58623ba9e6e31971d1f"
A summary count of the number of items by "trackId" and "time" (summarized per day) would actually suffice for analytics purposes. Hence I tried to turn this into a count query with aggregations, and constructed the query below:
query <- '{"size" : 0,
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"range": {
"time": {
"gte": 1441045800000,
"lte": 1443551400000
}
}
}
}
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "time",
"interval": "day",
"time_zone": "+05:30"
},
"aggs": {
"group_by_state": {
"terms": {
"field": "trackId",
"size": 0
}
}
}
}
}
}'
response <- elastic::Search(index="organised_recent",type="PROPERTY_SEARCH",body=query, search_type="count")
However, I did not gain in speed or document size. I think I am missing something, but I am not sure what.
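For reference, the summary described above (a count of items per day and per trackId, with days taken at the +05:30 offset used in the query) amounts to the grouping sketched below in Python. This only shows the intended result on a handful of hits held in memory; it does not address the query performance. The function name daily_counts is hypothetical, and only the first hit comes from the question; the other two are invented for the example.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# The "+05:30" offset from the date_histogram in the question.
IST = timezone(timedelta(hours=5, minutes=30))


def daily_counts(hits, tz=IST):
    """Count hits per (local calendar day, trackId); times are epoch milliseconds."""
    counts = Counter()
    for time_ms, track_id in hits:
        day = datetime.fromtimestamp(time_ms / 1000, tz=tz).date().isoformat()
        counts[(day, track_id)] += 1
    return counts


hits = [
    (1441122918941, "fd4b4ce88101e58623ba9e6e31971d1f"),  # from the question
    (1441122919941, "fd4b4ce88101e58623ba9e6e31971d1f"),  # hypothetical, 1 s later
    (1441137600000, "hypothetical-track"),  # hypothetical, 2015-09-01 20:00 UTC
]

counts = daily_counts(hits)
print(counts)
```

The third hit falls on 2015-09-01 in UTC but on 2015-09-02 at +05:30, which is why the time zone of the daily buckets matters.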
