Connect to Elasticsearch server using R

I am quite familiar with connecting to an Elasticsearch server from Python, but as I explore the R language I am stuck at the connection step. In Python I usually use code like this:
import requests
import json

RM_URL = "http://es-int-client-1.senvpc:9200/rm_201609/_search?timeout=10000"
payload = {
    "size": 10000000,
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {"term": {"events.id": str(event_id)}},
                        {"range": {"score_content_0": {"gte": score}}},
                        {"range": {"published_at": {"gte": start_date + "T00:00:00",
                                                    "lte": end_date + "T23:59:59"}}},
                        {"term": {"lang": la}}
                    ]
                }
            }
        }
    }
}
r = requests.post(RM_URL, json=payload)
results = json.loads(r.content)
I would be glad if anyone could show me how to do the same in R, thanks!
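For reference, a minimal sketch of the same request in R using the httr and jsonlite packages (untested, and assuming event_id, score, start_date, end_date and la are defined as in the Python version):

library(httr)
library(jsonlite)

rm_url <- "http://es-int-client-1.senvpc:9200/rm_201609/_search?timeout=10000"

# Nested named lists serialize to the same JSON shape as the Python dict
payload <- list(
    size = 10000000L,
    query = list(
        filtered = list(
            filter = list(
                bool = list(
                    must = list(
                        list(term = list("events.id" = as.character(event_id))),
                        list(range = list(score_content_0 = list(gte = score))),
                        list(range = list(published_at = list(
                            gte = paste0(start_date, "T00:00:00"),
                            lte = paste0(end_date, "T23:59:59")))),
                        list(term = list(lang = la))
                    )
                )
            )
        )
    )
)

r <- POST(rm_url, body = payload, encode = "json")  # encode = "json" serializes via jsonlite::toJSON
results <- fromJSON(content(r, as = "text", encoding = "UTF-8"))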

Related

Cannot Find Service - GHZ to load test GRPC Service

I'm trying to test a GRPC Service using GHZ. However, I get the error -
Cannot find service "com.server.grpc.Executor"
Config.json file:
"proto": "/Users/dev/Desktop/ghz/execute.proto",
"call": "com.server.grpc.Executor.execute",
"total": 2000,
"concurrency": 50,
"data": {
"param1": "test-data1",
"param2": "test-data2",
},
"max-duration": "10s",
"host": "<ip-address>:9090",
"c": 10,
"n": 200
}
proto file:
option java_package = "com.server.grpc";
option java_multiple_files = true;

service Executor {
    rpc execute(ExecuteRequest) returns (ExecuteResponse);
}

message ExecuteRequest {
    string param1 = 1;
    string param2 = 2;
}

message ExecuteResponse {
    bool res = 1;
    string msg = 2;
}
Running using command: ghz --config=<path/to/config>/config.json
Is there anything I'm missing?
Your protobuf file should declare a package, e.g.:
syntax = "proto3";
package example;
...
Then, your service would be fully qualified as example.Executor.execute, not com.server.grpc.Executor.execute, which is a language-specific (Java, I assume, given your option) fully-qualified name.
I assume you unintentionally omitted the opening brace ({) of the JSON file; that is, of course, required.
JSON is also strict here: "param2": "test-data2" must not be followed by a comma, because it's the last item in the object, so drop that trailing comma.
{
    "proto": "/Users/dev/Desktop/ghz/execute.proto",
    "call": "example.Executor.execute",
    "total": 2000,
    "concurrency": 50,
    "data": {
        "param1": "test-data1",
        "param2": "test-data2"
    },
    "max-duration": "10s",
    "host": "<ip-address>:9090",
    "c": 10,
    "n": 200
}
Assuming your service is running on <ip-address>:9090, that should then work!

Postman Schema Validation using TV4

I'm having trouble validating a schema in Postman using tv4 inside the Tests tab: it always returns a passing test, no matter what I feed it. I'm at a complete loss and could really use a hand; below are my tests and an example JSON response.
I've tried a ton of variations from every Stack Overflow post and tutorial I could find, and nothing works: it always returns true.
//Test example
var jsonData = JSON.parse(responseBody);
const schema = {
    "required": ["categories"],
    "properties": {
        "categories": {
            "required": ["aStringOne", "aStringTwo", "aStringThree"],
            "type": "array",
            "properties": {
                "aStringOne": { "type": "string" },
                "aStringTwo": { "type": "null" },
                "aStringThree": { "type": "boolean" }
            }
        }
    }
};
pm.test('Schema is present and accurate', () => {
    var result = tv4.validateMultiple(jsonData, schema);
    console.log(result);
    pm.expect(result.valid).to.be.true;
});
//Response example
{
    "categories": [
        {
            "aStringOne": "31000",
            "aStringTwo": "Yarp",
            "aStringThree": "More Yarp Indeed"
        }
    ]
}
This should return false, as all three properties are strings, but it passes. I'm willing to use a different validator or another technique, as long as I can export it as a Postman collection to use with Newman in my CI/CD process. I look forward to any help you can give.
As an aside, the reason tv4 reports true is that your schema applies properties and required to an array; those keywords only constrain objects, so the array items are never checked (an items keyword is what validates array elements). That said, I would suggest moving away from tv4 in Postman: the project isn't actively supported, and Postman now bundles a better (in my opinion), more actively maintained option called Ajv.
The syntax is slightly different, but hopefully this gives you an idea of how it could work for you.
I've mocked out your data and added everything to the Tests tab. If you change the jsonData variable to pm.response.json(), it will run against the actual response body.
var jsonData = {
    "categories": [
        {
            "aStringOne": "31000",
            "aStringTwo": "Yarp",
            "aStringThree": "More Yarp Indeed"
        }
    ]
};

var Ajv = require('ajv'),
    ajv = new Ajv({ logger: console, allErrors: true }),
    schema = {
        "type": "object",
        "required": ["categories"],
        "properties": {
            "categories": {
                "type": "array",
                "items": {
                    "type": "object",
                    "required": ["aStringOne", "aStringTwo", "aStringThree"],
                    "properties": {
                        "aStringOne": { "type": "string" },
                        "aStringTwo": { "type": "integer" },
                        "aStringThree": { "type": "boolean" }
                    }
                }
            }
        }
    };

pm.test('Schema is valid', function() {
    pm.expect(ajv.validate(schema, jsonData), JSON.stringify(ajv.errors)).to.be.true;
});
This is an example of it failing. I've included the allErrors flag so that it returns all the errors rather than just the first one it sees, and in the pm.expect() method I've added JSON.stringify(ajv.errors) so you can see the errors in the Test Result tab. It's a little messy and could be tidied up, but all the error information is there.
Setting the properties back to "string" shows the validation passing, and if one of the required keys is missing, it will flag an error for that too.
Working with schemas is quite difficult: creating them is not easy (nested arrays and objects are tricky), and so is making sure they actually check what you intend. There have been occasions where I thought something should fail and it passed validation. It just takes a bit of learning and practising; once you understand the schema structures, they can become extremely useful.

API Gateway and DynamoDB PutItem for String Set

I can't seem to find how to correctly call PutItem for a StringSet in DynamoDB through API Gateway. If I call it like I would for a List of Maps, then I get objects returned. Example data is below.
{
    "eventId": "Lorem",
    "eventName": "Lorem",
    "companies": [
        {
            "companyId": "Lorem",
            "companyName": "Lorem"
        }
    ],
    "eventTags": [
        "Lorem",
        "Lorem"
    ]
}
And my example template call for companies:
"companies" : {
"L": [
#foreach($elem in $inputRoot.companies) {
"M": {
"companyId": {
"S": "$elem.companyId"
},
"companyName": {
"S": "$elem.companyName"
}
}
} #if($foreach.hasNext),#end
#end
]
}
I've tried to call it with String Set listed, but it errors out still and tells me that "Start of structure or map found where not expected" or that serialization failed.
"eventTags" : {
"SS": [
#foreach($elem in $inputRoot.eventTags) {
"S":"$elem"
} #if($foreach.hasNext),#end
#end
]
}
What is the proper way to call PutItem for converting an array of strings to a String Set?
If you are using the JavaScript AWS SDK, you can use the DocumentClient API to store the SET data type: docClient.createSet converts an array into a SET.
var docClient = new AWS.DynamoDB.DocumentClient();
var params = {
    TableName: table,
    Item: {
        "yearkey": year,
        "title": title,
        "product": docClient.createSet(['milk', 'veg'])
    }
};
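As for the mapping template itself, the "Start of structure or map found where not expected" error suggests the problem is that SS is being fed {"S": ...} objects: in the low-level PutItem format a string set takes bare strings, i.e. "SS": ["Lorem", "Lorem"]. A sketch of the corrected template along those lines (untested):

"eventTags": {
    "SS": [
        #foreach($elem in $inputRoot.eventTags)
        "$elem"#if($foreach.hasNext),#end
        #end
    ]
}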

getting binary data when using POST request in httr package

I am using the POST function in httr library to get some data and the code is shown below.
library(httr)
url = "https://xxxx:xxx#api.xxx/_search" #omitted for privacy
a = POST(url,body = query,encode = "json")
The query is shown below in the appendix. a$content gives me a long raw vector (a whole bunch of hexadecimal numbers) that I have to run through another function before I get any useful data.
Ultimately I wish to get a data frame by using b = fromJSON(a$content). So far in order to get any data I have to use:
chr <- function(n) { rawToChar(as.raw(n)) }
b <- jsonlite::fromJSON(chr(a$content))
data <- b$hits$hits$`_source`
This seems inefficient considering that I am parsing in the data through a local function to get the final data. So my questions are as follows:
Am I using the POST function correctly to get the query?
Is there a more efficient (faster) way of getting my data into a data frame?
Appendix:
query = '
{
    "_source": [
        "start",
        "source.country_codes",
        "dest.country_codes"
    ],
    "size": 100,
    "query": {
        "bool": {
            "must": [
                {
                    "bool": {
                        "must_not": [
                            { "range": { "start": { "lte": "2013-01-01T00:00:00" } } },
                            { "range": { "start": { "gt": "2016-05-19T00:00:00" } } }
                        ]
                    }
                }
            ]
        }
    }
}'
The POST call looks good. Rather than decoding the raw vector by hand, let httr do it: content(a, as = "text") returns the body as a character string that jsonlite can parse directly.
js <- fromJSON(content(a, as = "text", encoding = "UTF-8"))
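Putting it together, a minimal sketch of the whole round trip under the same assumptions (your placeholder URL and the query string from the appendix):

library(httr)
library(jsonlite)

a <- POST(url, body = query, encode = "json")
stop_for_status(a)  # fail early on HTTP errors
js <- fromJSON(content(a, as = "text", encoding = "UTF-8"))
data <- js$hits$hits$`_source`  # the _source fields as a data frame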

ES Spring data add size into spring data format

I am trying to convert my JSON ES query to Java using Spring Data. I am almost there, but I cannot get "size": 0 into my query in Java.
GET someserver/_search
{
    "query": { ... },
    "size": 0,
    "aggregations": {
        "parent_aggregation": {
            "terms": {
                "field": "fs.id"
            },
            "aggs": {
                "sub_aggs": {
                    "top_hits": {
                        "sort": [
                            { "fs.smallVersion": { "order": "desc" } }
                        ],
                        "size": 1
                    }
                }
            }
        }
    }
}
In Java I am building a NativeSearchQuery object; I think it should be possible to set the size on that?
NativeSearchQuery searchQuery = createNativeSearchQuery(data, validIndices, query, filter);
es.getElasticsearchTemplate().query(searchQuery, response -> extractResult(data, response));
If you are using Elasticsearch before 2.0, you can use the search-type feature to do what you want, which is to return no docs. This can be accomplished using the NativeSearchQueryBuilder: if you set the SearchType to COUNT, you will not get docs back. Beware that in Elasticsearch 2.x this is deprecated and you should use "size": 0 instead. When the Spring Data Elasticsearch project supports Elasticsearch 2.0, this will most likely change and the size attribute should be exposed in this builder as well.
SearchQuery searchQuery = new NativeSearchQueryBuilder()
        .withQuery(matchAllQuery())
        .withSearchType(SearchType.COUNT)
        .withIndices("yourindex")
        .addAggregation(terms("nameofagg").field("thefield"))
        .build();
